Foreword


The author’s intention was: 


• to select and expose subjects that can be necessary or useful to those
interested in stochastic calculus and pricing in models of financial markets
operating under uncertainty;


• to introduce the reader to the main concepts, notions, and results of
stochastic financial mathematics;


• to develop applications of these results to various kinds of calculations
required in financial engineering.


The author also considered it a major priority to answer the requests of teachers
of financial mathematics and engineering by placing an emphasis on probabilistic and
statistical ideas and the methods of stochastic calculus in the analysis of market
risks.

The subtitle “Facts, Models, Theory” appears to be an adequate reflection of
the text structure and the author’s style, which is in large measure a result of the
‘feedback’ from students attending his lectures (in Moscow, Zürich, Aarhus, ...).

For instance, an audience of mathematicians always displayed an interest not
only in the mathematical issues of the ‘Theory’, but also in the ‘Facts’, the par- 
ticularities of real financial markets, and the ways in which they operate. This 
has induced the author to devote the first chapter to the description of the key 
objects and structures present on these markets, to explain there the goals of finan- 
cial theory and engineering, and to discuss some issues pertaining to the history of 
probabilistic and statistical ideas in the analysis of financial markets. 

On the other hand, an audience acquainted with, say, securities markets and 
securities trading showed considerable interest in various classes of stochastic pro- 
cesses used (or considered as prospective) for the construction of models of the 
dynamics of financial indicators (prices, indexes, exchange rates, ...) and important
for calculations (of risks, hedging strategies, rational option prices, etc.).

This is what we describe in the second and the third chapters, devoted to sto- 
chastic ‘Models’ both for discrete and continuous time. 

The author believes that the discussion of stochastic processes in these chapters 
will be useful to a broad range of readers, not only to the ones interested in financial 
mathematics. 

We emphasize here that in the discrete-time case, we usually start in our de- 
scription of the evolution of stochastic sequences from their Doob decomposition 
into predictable and martingale components. One often calls this the ‘martingale 
approach’. Regarded from this standpoint, it is only natural that martingale theory 
can provide financial mathematics and engineering with useful tools. 

The concepts of ‘predictability’ and ‘martingality’ permeating our entire exposition
are incidentally very natural from an economic standpoint. For instance, such
economic concepts as investment portfolio and hedging get simple mathematical def- 
initions in terms of ‘predictability’, while the concepts of efficiency and absence of 
arbitrage on a financial market can be expressed in the mathematical language, by 
making references to martingales and martingale measures (the First fundamental 
theorem of asset pricing theory; Chapter V, § 2b). 

Our approach to the description of stochastic sequences on the basis of the Doob 
decomposition suggests that in the continuous-time case one could turn to the 
(fairly broad) class of semimartingales (Chapter III, §5a). Representable as they 
are by sums of processes of bounded variation (‘slowly changing’ components) and 
local martingales (which can often be ‘fast changing’, as is a Brownian motion, for 
example), semimartingales have a remarkable property: one can define stochastic 
integrals with respect to these processes, which, in turn, opens up new vistas for the 
application of stochastic calculus to the construction of models in which financial 
indexes are simulated by such processes. 

The fourth (‘statistical’) chapter is meant to give the reader a notion of the statistical
‘raw material’ that one encounters in the empirical analysis of financial data.

Based mostly on currency cross rates (which are established on a global, prob- 
ably the largest, financial market with daily turnover of several hundred billion 
dollars) we show that the ‘returns’ (see (3) in Chapter II, § 1a) have distribution 
densities with ‘heavy tails’ and strong ‘leptokurtosis’ around the mean value. As 
regards their behavior in time, these values are characterized by the ‘cluster property’
and ‘strong aftereffect’ (we can say that ‘prices keep memory of their past’). We
demonstrate the fractal structure of several characteristics of the volatility of the
‘returns’.

Of course, one must take all this into account if one undertakes a construction 
of a model describing the actual dynamics of financial indexes; this is extremely 
important if one is trying to foresee their development in the future. 

‘Theory’ in general and, in particular, arbitrage theory are placed in the fifth 
chapter (discrete time) and the seventh chapter (continuous time). 
Central points there are the First and the Second fundamental asset pricing
theorems.

The First theorem states (more or less) that a financial market is arbitrage-free 
if and only if there exists a so-called martingale (risk-neutral) probability measure 
such that the (discounted) prices make up a martingale with respect to it. The 
Second theorem describes arbitrage-free markets with the property of completeness,
which ensures that one can build an investment portfolio whose value replicates
faithfully any given pay-off.

Both theorems deserve the name fundamental for they assign a precise mathe- 
matical meaning to the economic notion of an ‘arbitrage-free’ market on the basis 
of (well-developed) martingale theory. 

In the sixth and the eighth chapters we discuss pricing based on the First and the
Second fundamental theorems. Here we follow the tradition in that we pay much
attention to the calculation of rational prices and hedging strategies for various
kinds of (European or American) options, which are derivative financial instruments
with the best developed pricing theory. Options provide a perfect basis for the
understanding of the general principles and methods of pricing on arbitrage-free 
markets. 

Of course, the author faced the problem of the choice of ‘authoritative’ data and 
the mode of presentation. 

The above description of the contents of the eight chapters can give one a mea- 
sure for gauging the spectrum of selected material. However, for all its bulkiness, 
our book leaves aside many aspects of financial theory and its applications (e.g., the 
classical theories of von Neumann–Morgenstern and Arrow–Debreu and their updated
versions considering investors’ behavior delivering the maximum of the ‘utility
function’, and also computational issues that are important for applications). 

As the reader will see, the author often takes a lecturer’s stance by making 
comments of the ‘what-where-when’ kind. For discrete time we provide the proofs 
of essentially all main results. On the other hand, in the continuous-time case 
we often content ourselves with the statements of results (of martingale theory, 
stochastic calculus, etc.) and refer to a suitable source where the reader can find 
the proofs. 

The suggestion that the author could write a book on financial mathematics for 
World Scientific was put forward by Prof. Ole Barndorff-Nielsen at the beginning 
of 1995. The author accepted it, but it was not before summer that he
could start drafting the text. At first, he had in mind to discuss only the discrete- 
time case. However, as the work was moving on, the author was gradually coming 
to the belief that he could not give the reader a full picture of financial mathematics 
and engineering without touching upon the continuous-time case. As a result, we 
discuss both cases, discrete and continuous. 

This book consists of two parts. The first (‘Facts. Models’) contains
Chapters I–IV. The second (‘Theory’) includes Chapters V–VIII.
The writing process took around two years. Several months went into
typesetting, editing, and preparing a camera-ready copy. This job was done by 
I. L. Legostaeva, T. B. Tolozova, and A. D. Izaak on the basis of the Information 
and Publishing Sector of the Department of Mathematics of the Russian Academy 
of Sciences. The author is particularly indebted to them all for their expertise and 
selfless support as well as for the patience and tolerance they demonstrated each 
time the author came to them with yet another ‘final’ version, making changes in 
the already typeset and edited text. 

The author acknowledges the help of his friends and colleagues, in Russia and 
abroad; he is also grateful to the Actuarial and Financial Center in Moscow, 
VW-Stiftung in Germany, the Mathematical Research Center and the Center for 
Analytic Finance in Aarhus (Denmark), INTAS, and the A. Lyapunov Institute in 
Paris and Moscow for their support and hospitality. 
Aims and Problems of Financial Theory
1. Financial Structures and Instruments


In the modern view (see, e.g., [79], [342], and [345]) financial theory and engineering
must analyze the properties of financial structures and find the most sensible ways to
operate financial resources using various financial instruments and strategies with
due account paid to such factors as time, risks, and the (usually random) environment.


Time, dynamics, uncertainty, stochastics: it is thanks to these elements that 
probabilistic and statistical theories, such as, e.g., 


the theory of stochastic processes, 
stochastic calculus, 

statistics of stochastic processes, 
stochastic optimization 


make up the machinery used in this book and adequate to the needs of financial 
theory and engineering. 


§ 1a. Key Objects and Structures


1. We can distinguish the following basic objects and structures of financial theory 
that define and explain the specific nature of financial problems, the aims, and the 
tools of financial mathematics and engineering: 


individuals, 
corporations, 
intermediaries, 
financial markets. 


Chapter I. Main Concepts, Structures, and Instruments


[Chart: individuals, corporations, and intermediaries, each linked to financial markets at the center]


As shown in the chart, the theory and practice of finance assign the central role
among the above four structures to financial markets; they are the structures of
primary concern for the mathematical theory of finance in what follows.


2. Individuals. Their financial activities can be described in terms of the dilemma
‘consumption–investment’. The ambivalence of their behavior as both consumers
(‘consume more now’) and investors (‘invest now to get more in the future’) brings
one to optimization problems formulated in mathematical economics as
consumption–saving and portfolio decision making. In the framework of utility theory the
first problem is treated on the basis of the (von Neumann–Morgenstern) postulates
of the rational behavior of individuals under uncertainty. These postulates
determine the approaches and methods used to find preferable strategies by
means of a quantitative analysis, e.g., of the mean values of the utility functions.
The problem of ‘portfolio decision making’ confronting individuals can roughly be
described as the problem of the best allocation (investment) of funds (with due
attention to possible risks) among, say, property, gold, securities (bonds, stock,
options, futures, etc.), and the like. The idea of diversification (see § 2b) in building
a portfolio is reflected by such well-known adages as “Don’t put all your eggs in
one basket” or “Nothing ventured, nothing gained”. In what follows we describe
various opportunities (depending on the starting capital) opening for an individual
on a securities market.

Corporations (companies, firms), which own such ‘perceptible’ valuables as
‘land’, ‘factories’, ‘machines’, but are also proprietors of ‘organization structures’,
‘patents’, etc., organize businesses, maintain business relations, and manage
manufacturing. To raise funds for the development of manufacturing, corporations
occasionally issue stock or bonds (which governments also do). Corporate management
must be directed towards meeting the interests of shareholders and bondholders.
Intermediaries (intermediate financial structures). These are banks, investment
companies (of the mutual funds kind), pension funds, insurance companies, etc.
One can put here also stock exchanges, option and futures exchanges, and so on.


Among the world’s most renowned exchanges (as of 1997) one must list the
USA-based NYSE (the New York Stock Exchange), AMEX (the American Stock
Exchange), NASDAQ (the NASDAQ Stock Market), NYFE (the New York Futures
Exchange), CBOT (the Chicago Board of Trade), etc.ᵃ


3. Financial markets include money and Forex markets, markets of precious 
metals, and markets of financial instruments (including securities). 


In the market of financial instruments one usually distinguishes 


• underlying (primary) instruments and
• derivative (secondary) instruments;


the latter are hybrids constructed on the basis of underlying (more elementary) 
instruments. 


The underlying financial instruments include the following securities: 


• bank accounts,
• bonds,
• stock.


The derivative financial instruments include 


• options,
• futures contracts,
• warrants,
• swaps,
• combinations,
• spreads.


We note that financial engineering is often understood precisely as manipula- 
tions with derivative securities (in order to raise capital and reduce risks caused by 
the uncertain character of the market situation in the future). 


We now describe several main ingredients of financial markets.


ᵃ Throughout the book we turn fairly often to American financial structures and to
financial activities taking place in the USA. The main reason is that the American
financial markets have deep-rooted traditions (‘the Wall Street’!) and, at the same time,
these are the markets where many financial innovations are put to test. Moreover, there
exists an enormous literature describing these markets: periodicals as well as monographs,
primers, handbooks, investor’s guides, and so on. The reader can easily check through the
bibliography at the end of this book.
1. Money dates back to those ages when people learned to trade ‘things they had’
for ‘things they wanted to get’. This mechanism works also these days—we give 
money in exchange for goods (and services), and the salesmen, in turn, use it to 
buy other things. 

Modern technology has revolutionized the ways in which money circulates. Only
8% of all dollars that are now in circulation in the USA are banknotes and coins.
The bulk of payments in retailing and services is carried out by checks and
plastic cards (and proceeds by wire).

Besides its function of a ‘circulation medium’, money plays an important role 
of a ‘measure of value’ and a ‘saving means’ [108]. 


2. Foreign currency, i.e., the currency of other nations (its reserves, cross rates, 
and so on) is an important indicator of the nation’s well-being and development 
and is often a means of payment in foreign trade. 

Economic globalization brings into being monetary unions of several nations 
agreeing to harmonize their monetary and credit policies and regulating exchange 
rates between their currencies. 

One example is the well-known Bretton-Woods credit and currency system.
In 1944, Bretton-Woods (New Hampshire, USA) hosted a conference of the major
participants in the international trade, who agreed to maintain a currency system
(the ‘Bretton-Woods system’) in which the exchange rates could deviate from their
officially declared levels only in very narrow ranges: 1% on either side. (These parity
cross rates were fixed against the US dollar.) To launch this system and oversee it,
the nations concerned established the International Monetary Fund (IMF).

However, with the financial and currency crisis of 1973 affecting the major cur- 
rencies (the USA dollar, the German mark, the Japanese yen), the Bretton-Woods 
system was acknowledged to have exhausted its potentialities, and it was replaced 
by floating exchange rates. 

In March 1979, the majority of the European Union (EU) nations created the 
European Monetary System. It is stipulated in this system that the variations 
of the cross rates of the currencies involved should lie, in general, in the band of 
±2.25% around the official parity rates. If the cross rates of some currencies are
deemed under the threat of leaving this band, then the corresponding central banks
must intervene to prevent this course of events and ensure the stability of exchange
rates. (This explains why one often calls systems of this kind ‘systems of adjusted 
floating rates’.) 

Other examples of monetary unions can be found among some Caribbean,
Central American, and South American countries, which peg exchange rates against
some powerful ‘leader currency’. (For greater detail, see [108; pp. 459-468].)


3. Precious metals, i.e., gold, silver, platinum, and some others (namely, the
metals in the platinum group: iridium, osmium, palladium, rhodium, ruthenium)
have played throughout history (in particular, in the 19th and the beginning of the
20th century) and still play today an important role in the international credit and
currency system.

According to [108] the age of the gold standard began in 1821, when Britain
proclaimed the pound sterling convertible into gold. The United States did the same soon
afterwards. The gold standard came into full blossom in 1880–1914, but it could
never recover its status after World War I. Its traces evaporated completely in
1971, when the US Treasury formally abandoned the practice of buying and selling
gold at a fixed price.

Of course, gold keeps an important position in the international currency system. 
For example, governments often use gold to pay foreign debts. 

It follows from the above that one can clearly distinguish three phases in the 
development of the international monetary system: ‘gold standard’, ‘the Bretton- 
Woods system’, and the ‘system of adjusted floating exchange rates’. 


4. Bank account. We can regard it as a security of bond kind (see subsection 5
below) that reduces in effect to the bank’s obligation to pay certain interest on the
sum put into one’s account. We shall often consider bank accounts in what follows,
primarily because they are a convenient ‘unit of measurement’ for the prices of various
securities.
One usually distinguishes two ways to pay interest:
• m times a year (simple interest),
• continuously (compound interest).
If you open a bank account paying interest m times a year with interest rate
r(m), then on having put an initial capital B_0, in N years you obtain the amount

$$B_N(m) = B_0\left(1 + \frac{r(m)}{m}\right)^{mN}, \tag{1}$$

while in N + k/m years’ time (a fractional value, 0 < k < m), your capital will be

$$B_{N+k/m}(m) = B_0\left(1 + \frac{r(m)}{m}\right)^{mN+k}.$$

In the case of compound interest with interest rate r(∞) the starting capital B_0
grows in N years into

$$B_N(\infty) = B_0 e^{r(\infty)N}. \tag{2}$$

Clearly,

$$B_N(m) \to B_N(\infty)$$

as r(m) → r(∞) and m → ∞.

If the compound interest r(∞) is equal to r, then the adequate ‘rate of interest
payable m times a year’ r(m) can be found by the formula

$$r(m) = m\left(e^{r/m} - 1\right), \tag{3}$$
while the compound rate r = r(∞) corresponding to fixed r(m) can be found by
the formula

$$r = m \ln\left(1 + \frac{r(m)}{m}\right). \tag{4}$$

In the particular case of m = 1, setting $\hat{r} = r(1)$ and r = r(∞) we obtain the
following conversion formulas for these rates:

$$\hat{r} = e^{r} - 1, \qquad r = \ln(1 + \hat{r}). \tag{5}$$
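Formulas (1)–(5) translate directly into a few lines of code. The following Python sketch is only an illustration under the conventions stated above; the function names are ours and do not come from the text.

```python
import math

def grow_periodic(b0, rate_m, m, years):
    """Capital after `years` years when interest at nominal rate r(m)
    is paid m times a year; formula (1)."""
    return b0 * (1 + rate_m / m) ** (m * years)

def grow_continuous(b0, rate, years):
    """Capital after `years` years with continuously payable interest; formula (2)."""
    return b0 * math.exp(rate * years)

def periodic_from_continuous(rate, m):
    """Rate r(m) equivalent to the continuous rate r; formula (3)."""
    return m * math.expm1(rate / m)

def continuous_from_periodic(rate_m, m):
    """Continuous rate r equivalent to the rate r(m); formula (4)."""
    return m * math.log1p(rate_m / m)

# Equivalent rates must produce the same capital after any number of years.
r = 0.10
r12 = periodic_from_continuous(r, 12)
print(grow_periodic(10000, r12, 12, 5), grow_continuous(10000, r, 5))
```

Setting m = 1 in the two conversion functions gives the pair of formulas (5).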


Besides the ‘annual interest rate $\hat{r}$’, the bank can announce also the value of the
‘annual discount rate $\hat{q}$’, which means that one must put $B_0 = B_1(1 - \hat{q})$ into a
bank account to obtain an amount $B_1 = B_1(1)$ a year later. The relation between
$\hat{r}$ and $\hat{q}$ is straightforward:

$$(1 - \hat{q})(1 + \hat{r}) = 1,$$

therefore

$$\hat{q} = \frac{\hat{r}}{1 + \hat{r}}.$$


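Values of B_t(m) for B_0 = 10000 and r(m) = 0.1 (entries marked … are not available):

  t |  m = 1 |  m = 2 |  m = 3 |  m = 4 |  m = 6 | m = 12 |  m = ∞
  3 |      … |      … |      … |  13449 |  13465 |  13482 |      …
  4 |  14641 |      … |      … |      … |  14869 |  14894 |  14918
  5 |      … |      … |      … |      … |  16419 |  16453 |  16487
  6 |      … |      … |      … |      … |  18131 |  18176 |  18221
  7 |      … |      … |      … |      … |  20022 |  20079 |  20138
  8 |      … |      … |      … |      … |  22109 |  22182 |  22255
  9 |      … |      … |      … |      … |  24414 |  24504 |  24596
 10 |  25937 |  26533 |      … |      … |  26960 |  27070 |  27183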


One can also ask about the time necessary (under the assumption of continuously
payable interest r = a/100) to raise our capital twofold. Clearly, we can determine
N from the relation

$$2 = e^{aN/100},$$

i.e.,

$$N = \frac{100 \ln 2}{a} \approx \frac{70}{a}.$$


In practice (in the case of interest payable, say, twice a year) one often uses the
so-called rule of 72: if the interest rate is a/100, then capital doubles in about 72/a years.
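The two rules of thumb can be compared with the exact doubling times. The sketch below is ours: the function names are hypothetical, and the periodic case assumes growth according to formula (1) with a non-integer number of years allowed.

```python
import math

def doubling_time_continuous(a):
    """Years needed to double capital at continuously payable interest a/100."""
    return 100 * math.log(2) / a

def doubling_time_periodic(a, m):
    """Years needed to double capital when interest a/100 is paid m times a year,
    obtained by solving (1 + r/m)**(m*N) = 2 for N."""
    r = a / 100
    return math.log(2) / (m * math.log1p(r / m))

# Compare the exact values with the approximations 70/a and 72/a.
for a in (4, 8, 12):
    print(a,
          round(doubling_time_continuous(a), 2), round(70 / a, 2),
          round(doubling_time_periodic(a, 2), 2), round(72 / a, 2))
```

For moderate rates the exact doubling time under semiannual payment lies between 70/a and 72/a, which is why the slightly larger constant is used in that case.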

To help the reader form an idea of the growth of an investment for various
modes of interest payment (m = 1, 2, 3, 4, 6, 12, ∞) we present a table (see above)
of the values of B_t(m) (here B_0 = 10000) corresponding to t = 0, 1, ..., 10 and
r(m) = 0.1 for all m.
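A table of this kind is easy to regenerate. The following sketch (our own helper `balance`, not part of the text) applies formula (1) for finite m and formula (2) for m = ∞, and prints the values rounded to whole units.

```python
import math

def balance(b0, rate, m, t):
    """Value B_t(m) of the account after t years; m = math.inf means
    continuously payable interest."""
    if m == math.inf:
        return b0 * math.exp(rate * t)
    return b0 * (1 + rate / m) ** (m * t)

modes = [1, 2, 3, 4, 6, 12, math.inf]

# Header row: one column per mode of interest payment.
labels = ["m=inf" if m == math.inf else f"m={m}" for m in modes]
print("t".rjust(3) + "".join(label.rjust(9) for label in labels))

# One row per year t = 0, 1, ..., 10 with B_0 = 10000 and r(m) = 0.1.
for t in range(11):
    row = "".join(f"{balance(10000, 0.1, m, t):9.0f}" for m in modes)
    print(str(t).rjust(3) + row)
```

Within each row the values increase from left to right: for a fixed nominal rate, more frequent payment of interest gives faster growth.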


5. Bonds are promissory notes issued by a government, a bank, a corporation,
a joint stock company, or any other financial establishment to raise capital.

Bonds are fairly popular in many countries, and the total funds invested in bonds 
are larger than the funds invested in stock or other securities. Their main attraction 
(especially for a conservative investor) is as follows: the interest on bonds is fixed 
and payable on a regular basis, and the repayment of the entire loan at a specified 
time is guaranteed. Of course, one cannot affirm beyond all doubt that government
or corporate bonds are risk-free financial instruments. A certain degree of risk is
always present: for instance, the corporation may go bust and default on interest
payments. For that reason government bonds are less risky than corporate ones,
but the coupon yield on corporate bonds is larger.

An investor in bonds is naturally eager to know the riskiness of the corporations 
he considers for the purpose of bond purchase. Ratings of various issuers of bonds 
can be found in several publications (“Standard & Poor’s Bond Guide” for one). 
Corporations considered less risky (i.e., ones with higher ratings) pay lower interest. 
Accordingly, corporations with low ratings must issue bonds with higher interest 
rate to attract investors. 

We can characterize a bond issued at time t = 0 by several numerical
indexes ((i)–(vii)):

(i) the face value (par value) P(T, T), i.e., the sum payable to the holder of the
bond at

(ii) the maturity date (the year the bond matures) T (the time to maturity is
usually a year or shorter for short-term bonds, 2 to 10 years for middle-term
ones, and T > 30 years for long-term bonds);

(iii) the bond’s interest rate (coupon yield) r_c defining the dividends, the amount
payable to its holder by the issuer, by the formula r_c × (face value);

(iv) the original price P(0, T) of the bond of T-year maturity issued at time
t = 0.


If the bond issued at time t = 0 has, say, 10-year maturity, the face value $1000,
and the coupon yield r_c = 6/100 (= 6%), then having purchased it at the original
(purchase) price of $1000 we shall in ten years receive the profit of $600, which is
the sum of the following terms:


      face value                  $1000
  +   interest for 10 years       1000 × 6% × 10 = $600
  −   purchase price              $1000
  =   total                       $600


Of course, the holder of a bond of T-year maturity who bought it at time t = 0
can keep it all these T years for himself collecting the interest and the face value
(at time T). On the other hand he may consider it unprofitable to keep this bond
till maturity (e.g., if the inflation rate r_inf is greater than r_c). In that case the
bondholder can use his right to sell the bond (with maturity date T) at its

(v) market value P(t, T)


at time t (in general, t can be an arbitrary instant between 0 and T).

By definition, P(0, T) is the original price of the bond at the moment of flotation, 
while P(T,T) is clearly its face value. (Both are equal to $1000 in the above 
example.) Although it is in principle possible that the market value P(t, T) is larger 
than the face value P(T, T), one typically has the inequality P(t, T) < P(T, T). 

Assume now that the bondholder decides to sell the bond two years after the 
flotation, and the market value P(2,10) (here t = 2 and T = 10) of the bond is 
$800. Then his profit from having purchased the bond and having held it for two 
years is as follows: 


      market value                $800
  +   interest for 2 years        1000 × 6% × 2 = $120
  −   purchase price              $1000
  =   total                       −$80


Thus, his profits are in fact negative: his losses amount to $80. On the other 
hand an investor who buys this bond at its market value P(2, 10) = $800 will pocket 
the following return in 8 years: 


      face value                  $1000
  +   interest for 8 years        1000 × 6% × 8 = $480
  −   purchase price              $800
  =   total                       $680
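The three calculations above follow one pattern: the amount received on exit, plus the simple interest collected while holding the bond, minus the purchase price. A minimal sketch with a hypothetical helper `holding_profit` (our name) reproduces them:

```python
def holding_profit(face_value, coupon_rate, years_held, purchase_price, exit_value):
    """Profit from buying a bond, collecting coupons for `years_held` years,
    and then receiving `exit_value` (the face value at maturity or the
    market value on an earlier sale)."""
    interest = face_value * coupon_rate * years_held
    return exit_value + interest - purchase_price

# Held from flotation to maturity: bought at 1000, redeemed at 1000 after 10 years.
print(holding_profit(1000, 0.06, 10, purchase_price=1000, exit_value=1000))
# Sold after two years at the market value 800.
print(holding_profit(1000, 0.06, 2, purchase_price=1000, exit_value=800))
# Bought at the market value 800 and held for the remaining 8 years.
print(holding_profit(1000, 0.06, 8, purchase_price=800, exit_value=1000))
```

The sketch ignores discounting and reinvestment of coupons, exactly as the examples in the text do.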
We must emphasize that the interest r_c on the bond, its coupon yield, is fixed
over its life time, whereas its market value P(t, T) wavers. This is the result of
the influence of multiple economic factors: demand and supply, interest payable on
other securities, speculators’ activities, etc. Regarding {P(t, T)}, 0 ≤ t ≤ T, as a
random process evolving in time we see that this is a conditional process: the value
of P(T, T) is fixed (and equal to the face value of the bond).

In Fig. 1 we depict possible fluctuations on the time interval 0 < t < T of the 
market value P(t, T) that takes the prescribed face value P(T, T) at the maturity 
date T. 


FIGURE 1. Evolution of the market value P(t, T) of a bond (horizontal axis: time t from 0 to T; the path ends at the level P(T, T))


A frequently used characteristic of a bond is


(vi) the current yield

$$r_c(t, T) = \frac{r_c P(T, T)}{P(t, T)}, \qquad 0 < t < T,$$

the ratio of the yearly interest and the (current) market value,
which is important for the comparison of different bonds. (In the above example
r_c(0, 10) = r_c = 6% and r_c(2, 10) = 6% · 1000/800 = 7.5%.)

Another, probably the most important, characteristic of a bond, which enables 
one to estimate the returns from both the final repayment and the interest payments 
(due for the remaining time to maturity) and offers thus yet another opportunity 
for the comparison of different bonds, is 


(vii) the yield to maturity (on a percentage basis) or the profitability

$$\rho(T - t, T)$$

(here T − t is the remaining life time of the bond); its value must ensure
that the sum of the discounted values of the interest payable on the interval
(t, T] and the face value is the current market value of the bond. In other
words, ρ = ρ(T − t, T) is the root of the equation

$$P(t, T) = \sum_{k=1}^{T-t} \frac{r_c P(T, T)}{(1 + \rho)^k} + \frac{P(T, T)}{(1 + \rho)^{T-t}}. \tag{6}$$

(Here we measure time in years, t = 1, 2, ..., T.)

In the case when the market value P(t, T) is the same as the face value P(T, T),
we see from this definition that ρ(T − t, T) = r_c.
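For a market value different from the face value, the equation defining the yield to maturity generally has to be solved numerically. Its right-hand side decreases in the yield for yields above −1, so bisection is one simple option. The sketch below is ours, not the book's method: the function names, the search bracket, and the tolerance are assumptions, and the market value is assumed to lie between the prices at the two ends of the bracket.

```python
def bond_price(rho, face_value, coupon_rate, years_left):
    """Discounted coupons plus discounted face value (the right-hand side of
    the equation for the yield to maturity), with yearly coupons."""
    coupon = coupon_rate * face_value
    coupons = sum(coupon / (1 + rho) ** k for k in range(1, years_left + 1))
    return coupons + face_value / (1 + rho) ** years_left

def yield_to_maturity(market_value, face_value, coupon_rate, years_left,
                      lo=-0.99, hi=10.0, tol=1e-12):
    """Root of bond_price(rho) = market_value found by bisection on [lo, hi]."""
    for _ in range(200):
        mid = (lo + hi) / 2
        if bond_price(mid, face_value, coupon_rate, years_left) > market_value:
            lo = mid   # price too high, so the yield must be larger
        else:
            hi = mid
        if hi - lo < tol:
            break
    return (lo + hi) / 2

def current_yield(market_value, face_value, coupon_rate):
    """Yearly interest divided by the current market value."""
    return coupon_rate * face_value / market_value

# The bond of the running example: face value 1000, coupon 6%, 8 years left.
print(yield_to_maturity(1000, 1000, 0.06, 8))  # bond quoted at par
print(yield_to_maturity(800, 1000, 0.06, 8))   # bond quoted at a discount
print(current_yield(800, 1000, 0.06))
```

For the bond quoted below its face value the yield to maturity exceeds the coupon yield, since the buyer also gains the difference between the face value and the purchase price.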

We plot schematically in Fig. 2 a typical ‘yield curve’, the graph of ρ(s, T) as a
function of the remaining time s = T − t.


FIGURE 2. Profitability ρ = ρ(s, T) as a function of s = T − t (horizontal axis: s from 0 to T; vertical axis: ρ(s, T))


In the above description we did not touch upon the structure of the market
prices {P(t, T)} (regarded from the probabilistic point of view, say). We consider
this fairly complicated question further on, in Chapter III.

Here we only note that in discussions of the structure and the dynamics of 
the prices {P(t, T)} from the probabilistic standpoint one usually takes one of the 
following two approaches: 


a) direct specification of the evolution of the prices P(t, T) (the price-based
approach);

b) indirect specification, when in place of the prices P(t, T), one is given the
time structure of the yield {ρ(T − t, T)} or a similar characteristic, e.g., of
‘interest rate’ kind (the term structure approach).


We note also that formula (6), which is a link between the price of a bond and 
its yield, is constructed in accordance with the ‘simple interest’ pattern (cf. (1) for 
m = 1). It is also easy to find a corresponding formula in the case of continuously 
payable interest (‘compound interest’; cf. (2)). 
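One plausible form of such a formula is sketched here. We assume that the coupons r_c P(T, T) are still paid once a year and that only the discounting is done continuously, as in (2); with ρ denoting the corresponding continuously compounded yield, the analogue of (6) would read

```latex
P(t, T) = \sum_{k=1}^{T-t} r_c \, P(T, T) \, e^{-\rho k} + P(T, T) \, e^{-\rho (T-t)} .
```

Under this assumption the two yields are related in the same way as the rates in (5): the yield in (6) equals e^ρ − 1.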

We have already noted that bonds are floated by various establishments and for 
various purposes. 
Corporate bonds are issued to accumulate capital necessary for further
development or modernization, to cover operational costs, etc.

Money obtained from issuing government bonds, issued by national
governments, or municipal bonds, issued by state governments, city councils,
etc., is used to carry out various governmental programs or projects
(construction of roads, schools, bridges, ...) and cover budget deficits.
In America government bonds include Treasury bonds, notes, and bills,
which can be purchased through the Federal Reserve Banks or brokers.


Information on corporate bonds and their properties is available in many publi- 
cations (e.g., in “New York Stock Exchange”, section “New York Exchange Bonds”, 
or in “American Stock Exchange” ). 

This information is organized in the form of a list of quotations of the following
form: 


IBM-JJ15-7% of ‘01, 


which means that the bonds in question are issued by IBM, the interest is payableᵇ
on January 15 and July 15, with rate r_c = 7% and maturity year 2001.

Of course, one can also find in these publications data on the face value of the
bonds and their current market price. (For details, see, e.g., [469].) 


6. Shares (stock) are also issued by companies and corporations to raise funds. 
They mainly belong to one of the two types: ordinary shares (equities; common 
stock) and preference shares (preferred stock), which differ both in riskiness and 
the conditions of dividend payments. 

An owner of common stock obtains as dividends his share of the profits of
the firm, and their amount depends on its financial successes. If the company
goes bankrupt, then the shareholder can lose his investment. On the other hand,
preferred stock means that the investor’s risk to lose everything is smaller and his
dividends are guaranteed, although their amount, in general, does not increase with
the firm’s profits.

There also are other kinds of shares characterized by different degrees of
involvement in the management of the corporation or some peculiarities of dividend
payments, and so on.

When buying stock (or bonds) many investors are attracted not by dividends
(or interest, in the case of bonds), but by an opportunity to make money from
fluctuations of stock prices, buying cheap ahead of anyone else and selling high (also
before the others).

According to some estimates there are now more than 50 million shareholders in the
USA. (For instance, 2 418 447 investors had shares in AT&T by the end of 1992,
with the total number of shares 1 339 916 615; see [357].)


ᵇ Interest is commonly payable once a year in Europe, but twice a year (every 6 months)
in the USA.
To buy or sell shares one should address a brokerage house, an investment
company that is a member of a stock exchange. It should be noted that, although
the number of individual stockholders increases with time, the share of individuals
holding shares directly decreases: individuals are usually not themselves active
on the market, but participate through institutional investors (mutual or pension
funds, insurance companies, banks, and others, ‘betting’ on securities markets and,
in particular, stock markets).

Many countries have stock exchanges where shares are traded. Apparently, one
of the first was the Amsterdam Stock Exchange (1602), which traded the shares
of joint-stock companies. Traditionally, banks exert strong influence on European
(stock) exchanges, while the American stock exchanges have been separated from
the bank system since the 1930s.

The two largest American exchanges (as of 1997) are the NYSE (New York
Stock Exchange, the name under which it has been known since 1817; it had 1366
seats in 1987) and the AMEX (American Stock Exchange, organized on the basis of
the New York Curb Exchange founded in 1842). To make its shares eligible for trade
on an exchange a company must satisfy certain requirements (concerning its size,
profits, and so on). For example, the requirements of the NYSE are as follows:
the earnings before taxes must be at least $2.5 million and the number of shares
floated must be at least 1.1 million, of market value at least $18 million. Hence
only the stock of well-known firms may be traded on this exchange (2089 listed
corporations by 1993). Trade on the AMEX goes mostly in stock of medium-sized
companies; the number of listed firms there is 841.

The NASDAQ (National Association of Securities Dealers Automated Quotations)
Stock Market is another major American share-trading establishment; the
trade there proceeds by electronic networks. 4700 firms (large, medium-sized, and
small but rapidly developing) are registered here.

An impressive number of firms (around 40 000, [357]) are participants of the
OTC (over-the-counter) securities market. This market has no premises or even
central office. Deals are made by wire, through dealers who buy and sell shares
on their own accounts. This trade covers the most diverse securities: ordinary
and preference shares, corporate bonds, US Government securities, municipal
bonds, options, warrants, foreign securities.

The main reason why firms that are small, newly established, or issuing small 
numbers of shares use OTC dealers is that they meet there virtually no (or minimal) 
restrictions on the size of assets and the like. 

On the other hand, companies whose shares are eligible for trade on other
exchanges often go to the OTC market if the manner of bargaining and deal-making
accepted there seems more convenient than the routine of ‘properly organized’
exchanges. There can be other reasons for dealing through the OTC system, e.g., a
firm may be reluctant to disclose its financial state, as required by big exchanges.

Of course, it is important for investors, ‘big’ or ‘small’ alike, to have information 
on the health of firms issuing stock, stock quotations, and the dynamics of prices. 
Information about the global state of the economy and the markets, as expressed
by several ‘composite’, ‘generalized’ indexes, is also of importance. 

As regards indicators of the ‘global’ state of the economy, the most well-known 
among them are the Dow Jones Averages and Indexes. There are four Averages: 


• the DJIA (Dow Jones Industrial Average, taken over 30 industrial companies);

• the Dow Jones Transportation Average (over 20 air-carriers, railroad and
transportation companies);

• the Dow Jones Utility Average (over 15 gas, electric, and power companies);

• the Dow Jones 65 Composite Average (over all the 65 firms included in the
above three averages).


We note that, say, the DJIA (the Dow), which is an indicator of the state of the
‘industrial’ part of the economy and is calculated on the basis of the data on 30 large
‘blue-chip’ companies, is not a mere arithmetic mean. The stock of corporations of
higher market value has greater weight in the composite index, so that large changes
in the prices of a few companies can considerably change the index as a whole [310].

(On the background of the Dow. In 1883, Charles H. Dow began to draw up
tables of the average closing rates of nine railroad and two manufacturing companies.
These lists gained strong popularity and paved the way to “The Wall Street Journal”
founded (1889) by Dow and his partner Edward D. Jones. See, e.g., [310].)

Alongside the Dow Jones indexes, the following indicators are also widely used: 


• Standard & Poor’s 500 (S&P500) Index,

• the NYSE Composite Index,

• the NASDAQ Composite Index,

• the AMEX Market Value Index,

• Value Line Index,

• Russell 2000 Stock Index,

• Wilshire 5000 Equity Index.


Standard & Poor’s 500 Index is (by contrast with the DJ) calculated on the
basis of data about many companies (500 in all = 400 industrial companies + 20
transportation companies + 40 utilities + 40 financial companies); the NYSE Index
comprises the stock of all firms listed at the New York Stock Exchange, and so on.

(On the background of Standard & Poor’s. Henry Varnum Poor started publishing
yearly issues of “Poor’s Manual of Corporate Statistics” in 1860, more than
20 years before the Dow Jones & Company’s first publication of the daily averages
of closing rates. In 1941, Poor’s Finance Services merged with Standard Statistics
Company, another leader in the collection and publishing of financial information.
The result of this union, Standard & Poor’s Corporation, became a major information
service and publisher of financial statistics (see, for instance, [310]).)

Besides various publications one can obtain information on instantaneous ‘bid’
and ‘ask’ prices of stock from the NASDAQ electronic system (covering shares of
about 5000 firms), Reuters, Bloomberg, Knight Ridder, Telerate. Brokers can at
any time learn about prices through dealers and get in direct touch with the ones
whose prices seem more attractive.

In view of economic globalization, it is important that one knows not only about 
the positions of national companies, but also about foreign ones. Such data are 
also available in corresponding publications. (Note that in the standard American 
nomenclature the attributes ‘World’, ‘Worldwide’, or ‘Global’ relate to all mar- 
kets, including the USA, while ‘International’ means only foreign markets, outside 
the USA). 

One can learn about the activities at 16 major international exchanges from
“The Wall Street Journal” daily, which publishes in its “Stock Market Indexes”
section the closing composite indexes of these exchanges and their relative and
absolute changes from the previous day.

As everybody knows from numerous publications, even in mainstream news- 
papers, stock prices and the values of various financial indexes are permanently 
changing in a tricky, chaotic way. 

We depict the changes in the DJIA as an example: 


[Figure 3: chart of the DJIA against time; dates marked on the chart: 25.08.1987,
19.10.1987, 04.01.1988, 01.07.1988, 03.01.1989, 03.07.1989, 09.10.1989.]

FIGURE 3. Dynamics of the DJIA (Dow Jones Industrial Average).
On October 19, 1987, the day of the crash, the DJIA fell by 508 points


In the next chart, Fig. 4, we plot the daily changes \(S = (S_n)\) of the S&P500
Index during 1982-88. It will be clear from what follows that, with an eye to the
analysis of the ‘stochastic’ components of indexes, it is more convenient to consider
the quantities \(h_n = \ln \frac{S_n}{S_{n-1}}\) rather than the \(S_n\) themselves.
We can interpret these quantities as ‘returns’ or ‘logarithmic returns’ (see
Chapter I, § 1a). Their behavior is more ‘uniform’ than that of \(S = (S_n)\). We
plot the corresponding graph of the values of \(h = (h_n)\) in Fig. 5.
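As a small illustration of the passage from the values of an index to its logarithmic returns, one may compute \(h_n = \ln(S_n/S_{n-1})\) as in the following Python sketch. The index levels are hypothetical (they are not S&P500 data), and the function name `log_returns` is our own choice.

```python
import math


def log_returns(prices):
    """Return h_n = ln(S_n / S_{n-1}) for n = 1, ..., len(prices) - 1."""
    if any(p <= 0 for p in prices):
        raise ValueError("prices must be positive")
    return [math.log(cur / prev) for prev, cur in zip(prices, prices[1:])]


# Hypothetical index levels, for illustration only.
S = [100.0, 102.0, 99.0, 99.0, 105.0]
h = log_returns(S)

# Logarithmic returns are additive: their sum equals ln(S_N / S_0).
total = sum(h)
```

The additivity noted in the last comment is one reason why the \(h_n\) are often more convenient to work with than the \(S_n\) themselves.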
FIGURE 4. The S&P500 Index in 1982-88

FIGURE 5. Daily values of \(h_n = \ln \frac{S_n}{S_{n-1}}\) for the S&P500 Index;
based on the data in Fig. 4

FIGURE 6. Round-the-clock dynamics of the averaged (between the ‘bid’
and ‘ask’ prices) DEM/USD cross rate, August 19, 1993 (The label 0
corresponds to 0:00 GMT)


The dip at the end of 1987, clearly seen in Figs. 3 and 4, is related to the famous
October crash, when stock prices fell abruptly and the investors, afraid of losing
everything, rushed to sell. Mass sales of shares provoked an ever growing emotional
and psychological mayhem and resulted in an avalanche of selling bids. For example,
about 300m shares changed hands at the NYSE during the whole of January 1987, while
there were 604m shares on offer on the day of the crash, October 19, and this number
increased to 608m on October 20.

The opening price of an AT&T share on the day of the crash was $30 and the closing
price was $23 5/8, so that the corporation lost 21.2% of its market value.

On the whole, the DJIA was 22.6% lower on October 19, 1987, than on December 
31, 1986, which means $500 bn in absolute figures. 

During another well-known October crash, the one of 1929, the turnover of the
NYSE was 12.9m shares on October 24 (before the crash) and 16.4m on the day
of the crash. Accordingly, the DJIA on October 29 was 12.8% lower than on
December 31, 1928, which comprised $14 bn in absolute figures.

We supplement Figs. 3-5, in which one clearly sees the vacillations of the DJIA
and the S&P500 Index over a period of several years, by Fig. 6 depicting the
behavior of the DEM/USD cross rate during one business day (namely, Thursday,
August 19, 1993).

The first attempt towards a mathematical description of the evolution of stock
prices \(S = (S_t)_{t \ge 0}\) (on the Paris market) on the basis of probabilistic
concepts was made by Louis Bachelier (11.03.1870-28.04.1946) in his thesis “Théorie
de la spéculation” [12] published in Annales Scientifiques de l’École Normale
Supérieure, vol. 17, 1900, pp. 21-86, where he proposed to regard
\(S = (S_t)_{t \ge 0}\) as a random (stochastic) process.

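Analyzing the ‘experimental data’ on the prices \(S_t^{(\Delta)}\),
\(t = 0, \Delta, 2\Delta, \ldots\) (registered at time intervals \(\Delta\)) he
observed that the differences \(S_t^{(\Delta)} - S_{t-\Delta}^{(\Delta)}\) had
averages zero (in the statistical sense) and fluctuations
\(|S_t^{(\Delta)} - S_{t-\Delta}^{(\Delta)}|\) of order \(\sqrt{\Delta}\).

The same properties are shared, e.g., by a random walk \(S_t^{(\Delta)}\),
\(t = 0, \Delta, 2\Delta, \ldots\), with

\[ S_t^{(\Delta)} = S_0 + \sum_{k \le [t/\Delta]} \xi_k^{(\Delta)}, \]

where the \(\xi_k^{(\Delta)}\) are identically distributed random variables taking
two values, \(\pm\sigma\sqrt{\Delta}\), with probability \(\tfrac{1}{2}\).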

Passing to the limit as \(\Delta \to 0\) (in the corresponding probabilistic sense)
we arrive at the random process

\[ S_t = S_0 + \sigma W_t, \qquad t \ge 0, \]

where \(W = (W_t)_{t \ge 0}\) is just a Brownian motion introduced by Bachelier (or a
Wiener process, as it is called after N. Wiener, who developed [476] in 1923 a rigorous
mathematical theory of this motion; see also Chapter III, § 3a).
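The scaling behind this limit can be checked numerically. The following Python sketch simulates the random walk with steps \(\pm\sigma\sqrt{\Delta}\) and compares the sample mean and variance of its value at time \(t\) with \(S_0\) and \(\sigma^2 t\), the mean and variance of \(S_0 + \sigma W_t\). It is only an illustration under our own assumptions: the steps are taken to be independent, and the parameter values, the seed, and the function names are hypothetical.

```python
import math
import random


def random_walk_endpoints(s0, sigma, t, n_steps, n_paths, seed=0):
    """Simulate the walk at time t, with delta = t / n_steps.

    Each of the n_steps steps equals +sigma*sqrt(delta) or -sigma*sqrt(delta)
    with probability 1/2; the steps are assumed independent (an assumption
    of this sketch).
    """
    rng = random.Random(seed)
    delta = t / n_steps
    step = sigma * math.sqrt(delta)
    endpoints = []
    for _ in range(n_paths):
        ups = sum(rng.random() < 0.5 for _ in range(n_steps))
        endpoints.append(s0 + step * (2 * ups - n_steps))
    return endpoints


def mean_and_variance(xs):
    """Sample mean and (biased) sample variance."""
    m = sum(xs) / len(xs)
    v = sum((x - m) ** 2 for x in xs) / len(xs)
    return m, v


# Hypothetical parameters chosen only for illustration.
S0, SIGMA, T = 100.0, 2.0, 1.0
ends = random_walk_endpoints(S0, SIGMA, T, n_steps=100, n_paths=2000)
mean, var = mean_and_variance(ends)
# mean should be close to S0, and var close to SIGMA**2 * T.
```

Note that the variance of the walk at time \(t\) equals \(\sigma^2 t\) for every choice of `n_steps`; refining the grid changes the shape of the distribution, not its first two moments.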

Starting from a Brownian motion, Bachelier derived the formula \(C_T = \mathsf{E} f_T\)
for the expectation (here \(f_T = (S_T - K)^+\)), which from the modern viewpoint gives
one (under the assumption that the bank interest rate \(r\) is equal to zero and
\(B_0 = 1\)) the value of the reasonable (fair) price (premium) that a buyer of a
standard call option must pay to its writer who undertakes to sell stock at the
maturity date \(T\) at the strike (exercise) price \(K\) (see § 1c below). (If
\(S_T > K\), then the option buyer gets the profit equal to \((S_T - K) - C_T\) because
he can buy stock at the price \(K\) and sell it promptly at the higher price \(S_T\),
while if \(S_T < K\), then the buyer merely does not show the option for exercise and
his losses are equal to the paid premium \(C_T\).)

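Bachelier’s formula (see Chapter VIII, § 1a)

\[ C_T = (S_0 - K)\,\Phi\!\left(\frac{S_0 - K}{\sigma\sqrt{T}}\right)
       + \sigma\sqrt{T}\,\varphi\!\left(\frac{S_0 - K}{\sigma\sqrt{T}}\right), \tag{7} \]

where

\[ \varphi(x) = \frac{1}{\sqrt{2\pi}}\,e^{-x^2/2}, \qquad
   \Phi(x) = \int_{-\infty}^{x} \varphi(y)\,dy, \]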
was in effect a forerunner of the famous Black-Scholes formula for the rational price
of a standard call option in the case where \(S = (S_t)\) can be described by a
geometric (or economic) Brownian motion

\[ S_t = S_0\,e^{\sigma W_t + (\mu - \sigma^2/2)t}, \tag{8} \]

where \(W = (W_t)_{t \ge 0}\) is a usual Brownian motion.

Under the assumption that the bank interest rate \(r\) is equal to zero and \(B_0 = 1\),
the Black-Scholes formula gives one the following formula for the rational price
\(C_T = \mathsf{E}(S_T - K)^+\) of a standard call option:

\[ C_T = S_0\,\Phi(d_+) - K\,\Phi(d_-), \tag{9} \]

where

\[ d_\pm = \frac{\ln\frac{S_0}{K} \pm \frac{\sigma^2 T}{2}}{\sigma\sqrt{T}}. \tag{10} \]


(As regards the general Black-Scholes formula for \(r \ne 0\) and \(B_0 > 0\), see
Chapter VIII, § 1b.)
It is worth noting that by (7), if \(K = S_0\), then

\[ C_T = \sigma\sqrt{\frac{T}{2\pi}}, \]

which gives one an idea of the increase of the rational option price with maturity
time \(T\).
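Formulas (7), (9), and (10) are easy to evaluate numerically. The Python sketch below does so with the standard library only, expressing \(\Phi\) through the error function; the function names and the numerical values of the parameters are our own, hypothetical choices. One should bear in mind that \(\sigma\) means different things in the two models: in (7) it measures absolute price fluctuations, whereas in (8)-(10) it measures relative ones.

```python
import math


def norm_pdf(x):
    """Standard normal density phi(x)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def norm_cdf(x):
    """Standard normal distribution function Phi(x), via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bachelier_call(s0, k, sigma, t):
    """Formula (7): the model S_t = S_0 + sigma * W_t with r = 0."""
    d = (s0 - k) / (sigma * math.sqrt(t))
    return (s0 - k) * norm_cdf(d) + sigma * math.sqrt(t) * norm_pdf(d)


def black_scholes_call_zero_rate(s0, k, sigma, t):
    """Formulas (9)-(10): geometric Brownian motion with r = 0."""
    vol = sigma * math.sqrt(t)
    d_plus = (math.log(s0 / k) + 0.5 * vol * vol) / vol
    d_minus = d_plus - vol
    return s0 * norm_cdf(d_plus) - k * norm_cdf(d_minus)


# Hypothetical parameters, for illustration only.
atm_bachelier = bachelier_call(100.0, 100.0, 2.0, 1.0)
atm_closed_form = 2.0 * math.sqrt(1.0 / (2.0 * math.pi))  # sigma * sqrt(T / (2*pi))
atm_black_scholes = black_scholes_call_zero_rate(100.0, 100.0, 0.2, 1.0)
```

For \(K = S_0\) the first two quantities agree, which is the special case of (7) noted above.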

The problem of an adequate description of the dynamics of various financial
indicators \(S = (S_t)_{t \ge 0}\) of stock price kind is far from being exhausted,
and it is the subject of many studies in probability theory and statistics (we
concentrate on these issues in Chapters II, III, and IV). We have already explained
(see subsection 4) that a similar (but maybe even more complicated) problem arises in
connection with the description of the stochastic evolution of prices in bond markets
\(P = \{P(t,T)\}_{0 \le t \le T}\), which are regarded as random processes with fixed
conditions at the ‘right-hand end-point’ (i.e., for \(t = T\)). (We discuss these
questions below, in Chapter III.)


§1c. Market of Derivatives. Financial Instruments 


1. A sharp rise in the interest in securities markets throughout the world goes back
to the early 1970s. It would be appropriate to try to understand the course of events
that gave impetus to shifts in the economy and, in particular, securities markets.

In the 60s, the financial markets, both the (capital) markets of long-term
securities and the (money) markets of short-term securities, were characterized by
extremely low volatility, the interest rates were very stable, and the exchange rates
fixed. From 1934 to 1971, the USA adhered to the policy of buying and selling gold at a
fixed price of $35 per ounce (= 28.25 grams). The US dollar was considered to
be a gold equivalent, ‘as good as gold’. Thus, the price of gold was imposed from
outside, not determined by market forces.

This market situation restricted investors’ initiative and hindered the develop- 
ment of new technology in finance. 

On the other hand, several events in the 1960-70s induced considerable structural 
changes and the growth of volatility on financial markets. We indicate here the most 
important of these events (see also § 1b.2). 

1) The transition from the policy of fixed cross rates (pursued by several groups
of countries) to rates freely ‘floating’ (within some ‘band’), spurred by the acute
financial and currency crisis of 1973, which affected the American dollar, the German
mark, and the Japanese yen. This transition has set, in particular, an important
and interesting problem of the optimal timing and the magnitude of interventions
by a central bank.

2) The devaluation of the dollar (against gold): in 1971, Nixon’s administration
gave up the policy of fixing the price of gold at $35 per ounce, and gold shot up:
its price was $570 per ounce in 1980, fell to $308 in 1984, and has been fluctuating
mainly in the interval $300-$400 since then.

3) The global oil crisis provoked by the policy of the OPEC, which came forward 
as a major price-maker in the oil market. 

4) A decline in stock trade. (The decline in the USA at this time was larger in 
real terms than during the Great Depression of the 1930s.) 


At that point the old ‘rule of thumb’ and simple regression models became 
absolutely inadequate to the state of economy and finances. 

And indeed, the markets responded promptly to new opportunities opened before
investors, who gained much more space for speculations. Option and bond
futures exchanges sprang up in many places. The first specialist exchange for trade
in standard option contracts, the Chicago Board Options Exchange (CBOE), was
opened in 1973. This is how the investors responded to the new, more promising
opportunities: 911 contracts on call options (on stock of 16 firms) were sold on the
opening day, April 26. A year later, the daily turnover reached more than 20 000
contracts, it grew to 100 000 three years later, and to 700 000 in 1987. Bearing in
mind that one contract means a package of 100 shares, we see that the turnover in
1987 was 70m shares each day; see [35].

The same year, 1987, the daily turnover at the NYSE (New York Stock Ex- 
change) was 190 m shares. 

1973 was special not only because the first proper exchange for standard option
contracts was opened. Two papers published that year, The pricing of options and
corporate liabilities by F. Black and M. Scholes ([44]) and The theory of rational
option pricing by R.C. Merton ([357]), brought a genuine revolution in pricing
methods. It would be difficult to name other theoretical works in the finance
literature that could match these two in the speed with which they found applications
in practice and became a source of inspiration for multiple studies of more
complex options and other types of derivatives.


2. The most prominent among the derivatives considered in financial engineering 
are option and futures contracts. It is common knowledge that both are highly 
risky; nevertheless they (and their various combinations) are used successfully not 
merely to draw (speculative) profits but also as a protection against drastic changes 
in prices. 

An Option is a security (contract) issued by a firm, a bank, or another financial
company and giving its buyer the right to buy or sell something of value
(a share, a bond, currency, etc.) on specified terms at a fixed instant or
during a certain period of time in the future.

By contrast with option contracts giving a right to buy or sell, 

a Futures contract is a commitment to buy or sell something of value (e.g.,
gold, cereals, foreign currency) at a preset instant of time in the future at
a (futures) price fixed at the moment of signing the deal. (One can find an
indispensable source of statistical data on all securities, as well as options
and futures, in an American financial weekly “Barron’s”.)

α) Futures are of practical interest both to sellers and buyers of various goods.

For example, a clever farmer worried about selling the future crop at a ‘good’
price and afraid of a drastic downturn in prices would prefer to make a ‘favorable’
agreement with a miller (baker) on the delivery of (not yet grown) grain instead of
waiting for the grain to ripen and selling it at the (who knows what!) market price
of the day.

Accordingly, the miller (baker) is also interested in the purchase of grain at a 
‘suitable’ price and is seeking to forestall large rises in grain prices possible in the 
future. 

In the end, both share the same objective of minimizing the risks due to the
uncertainty of the future market prices. Thus, a futures contract is a form of
agreement that can in a way be convenient for both sides.

Before a substantial discussion of futures contracts it seems appropriate to con- 
sider a related form of agreement: so-called forward contracts. 

β) As in the case of futures contracts,

a Forward contract is also an agreement to deliver (or buy) something in the
future at a specified (forward) price.

The difference between futures and forward contracts is as follows: while
forwards are usually sold without intermediaries, futures are traded at a specialized
exchange, where the seller and the buyer do not necessarily know each other and
where a special-purpose settlement system makes a subsequent renouncement of
the contract uneconomical.

Usually, the person willing to buy is said to ‘hold a long position’, while the one 
undertaking the delivery in question is holding a ‘short position’. 

A cardinal question of forward and futures contracts is that of the preset
‘forward’ (‘futures’) price, which, in effect, can turn out distinct from the market
price at the time of delivery.

Roughly, transactions involving forwards run as follows:


[Diagram: money passes from the buyer to the seller; goods pass from the seller
to the buyer.]


Here we understand ‘goods’ in a broad sense. For example, this can be currency.
If you are interested in trading, say, US dollars for Swiss francs, then the quotations
that you find may look as follows:


Current price 1 USD = 1.20 CHF, 
(i.e., we can buy 1.20 CHF for $1); 
30-day forward 1 USD = 1.19 CHF; 
90-day forward 1 USD = 1.18 CHF; 
180-day forward 1 USD = 1.17 CHF. 


This picture is typical in that you tend to get less for $1 with the increase of
delivery time, and therefore if you need CHF 10 000 in 6 months’ time, then you must
pay

\[ \frac{10\,000}{1.17} \approx \$8547. \]

However, if you need CHF 10 000 now, then it costs you

\[ \frac{10\,000}{1.20} \approx \$8333. \]


Clearly, the actual, market price of CHF 10000 can in 6 months be different from 
$8547. It can be less or more, depending on the CHF/USD cross rate 6 months 
later. 
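The arithmetic of this example is easily reproduced. In the Python sketch below the quotations are those given above, while the dictionary and function names are our own; amounts are rounded to whole dollars.

```python
# Quotations from the text: Swiss francs received for 1 US dollar.
quotes_chf_per_usd = {
    "spot": 1.20,
    "30-day": 1.19,
    "90-day": 1.18,
    "180-day": 1.17,
}


def usd_cost(chf_amount, chf_per_usd):
    """Dollar cost of buying chf_amount francs at the quoted rate."""
    return chf_amount / chf_per_usd


# Dollar cost of CHF 10 000 for each delivery time, rounded to whole dollars.
costs = {
    tenor: round(usd_cost(10_000, rate))
    for tenor, rate in quotes_chf_per_usd.items()
}
```

With these quotations the dollar cost of the same sum in francs grows with the delivery time, from $8333 for immediate delivery to $8547 for delivery in 180 days.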

(On the background of futures. According to N. Apostolou’s “Keys to investing
in options and futures” (Barron’s Educational Series, 1991), “the first organized 
commodity exchange in the United States was the Chicago Board of Trade (CBOT), 
founded in Chicago in 1848. This exchange was originally intended as a central 
market for the conduct of cash grain business, and it was not until 1865 that the 
first futures transaction was performed there. Today, the Chicago Board of Trade, 
with nearly half of all contracts traded in the United States, is the largest futures 
exchange in the world.

Financial futures were not introduced until the 1970s. In 1976, the Interna- 
tional Monetary Market (IMM), a subsidiary of the Chicago Mercantile Exchange 
(CME), began the 90-day Treasury bill futures contract. The following year, the 
CBOT initiated the Treasury bond futures contract. In 1981, the IMM created the
Eurodollar futures contract. 

A major financial futures milestone was reached in 1982, when the Kansas City 
Board of Trade introduced a stock index futures contract based upon the Value Line 
Stock Index. This offering was followed in short order by the introduction of the 
S&P500 Index futures contract on the Chicago Mercantile Exchange and the New 
York Stock Exchange Composite Index traded on the New York Futures Exchange. 
All of these contracts provide for cash delivery rather than delivery of securities. 

Trading volume on the futures exchanges has surged in the last three decades— 
from 3.9 million contracts in 1960 to 267.4 million contracts in 1989. In their short 
history, financial futures have become the dominant factor in the futures markets. 
In 1989, 60 percent of all futures contracts traded were financial futures. By far 
the most actively traded futures contract is the Treasury bond contract traded on 
the CBOT.”) 

γ) A forward contract, as already mentioned, is an agreement between two sides
and, in principle, there exists a risk of its potential violation. More than that, it is
often difficult to find a supplier of the goods you need or, conversely, an interested
buyer.

Apparently, this is what has brought into being futures contracts traded at 
exchanges equipped with special settlement mechanisms, which in general can be 
described as follows. 

Imagine that on March 1, you instruct your broker to buy some amount of wheat
by, say, October 1 (and specify a prospective price). The broker passes your request
to a produce exchange, which forwards it to a trader. The trader looks for a suitable 
price and, if successful, informs the potential sellers of his wish to buy a contract 
at this price. If another trader agrees, then the bargain is made. Otherwise the 
trader informs the broker and the latter informs you that there are goods at higher 
prices, and you must take one or another decision. 

Assume that, finally, the price of the contract is agreed upon and you, the buyer,
keep the long position, while the seller keeps the short position. Now, to make the
contract effective each side must put a certain amount, called the initial margin,
into a special exchange account (this is usually 2%-10% of \(\Phi_0\) depending on the
customers’ history). Next, the settlement mechanism comes into force. Settlements
are usually carried out at the end of each day and run as follows. (Here \(\Phi_0\) is
the (futures) price, i.e., both sides agree that the wheat will be delivered on
October 1, at the price \(\Phi_0\).)

Assume that at time t = 0 the price \(\Phi_0\) = $10 000 is the (futures) price of
wheat delivery at the delivery time T = 3, i.e., in three days. Assume also that at the
end of the next day (t = 1) the market futures price of the delivery of wheat at the
time T = 3 becomes $9900. In that case the clearing house at the exchange transfers
$100 (= 10 000 − 9900) into the supplier’s (farmer’s) account. Thus, the farmer
earns some profit and is now in effect left with a futures contract worth $9900 in
place of $10 000.

We note that if the delivery were carried out at the end of the first day, at the
new futures price $9900, then the total revenues of the farmer would be precisely
the futures price \(\Phi_0\) because $100 + $9900 = $10 000 = \(\Phi_0\).

Of course, the clearing house writes these $100 off the buyer’s account, and 
he must additionally pay $100 into the margin account. (Sometimes, additional 
payments are necessary only if the margin account falls below a certain level—the 
maintenance margin). 

Assume that the same occurs at time t = 2 (see Fig. 7). Then, in all, the farmer
has taken $200 onto his account and the same sum has been written off the buyer’s
account.


[Figure 7: futures price against time t: \(\Phi_0\) = 10 000 at t = 0, 9 900 at
t = 1, 9 800 at t = 2, and 10 100 at T = 3.]

FIGURE 7. Evolution of a futures price


However, if the futures price of (instant) delivery (= the market price) becomes
$10 100 at time t = 3, then one must write $300 (= 10 100 − 9 800) off the farmer’s
account, and therefore, all in all, he loses $100 (= 300 − 200), while the buyer gains
$100.

Note that if the buyer decides to buy wheat at time T = 3 on condition of
instant delivery, then he must pay $10 100 (= the instantaneous market price).
However, bearing in mind that $100 have already been transferred to his account,
the delivery actually costs him 10 100 − 100 = $10 000, i.e., it is precisely the same
as the contract price \(\Phi_0\).

The same holds for the farmer: the money actually obtained for the delivery of
wheat is precisely $10 000, because he is paid the market price $10 100 at time T = 3,
which, combined with the $100 written off his account, makes up precisely $10 000.

This example clearly depicts the role of the clearing house as a ‘watchdog’ keep- 
ing track of all the transactions and the state of the margin account, which is all 
very essential for smooth execution of contracts. 
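The daily settlements in this example can be traced in a few lines of code. The following Python sketch uses the futures prices of the example; it is a simplified illustration in which margin requirements and interest on the margin account are ignored, and the function and variable names are our own.

```python
def mark_to_market(futures_prices):
    """Daily transfers for one futures contract.

    futures_prices[0] is the agreed futures price; the last entry is the
    market price at the delivery time. A fall in the futures price credits
    the seller (short position) and debits the buyer (long position) by the
    same amount, and conversely for a rise.
    """
    seller_flows = [prev - cur
                    for prev, cur in zip(futures_prices, futures_prices[1:])]
    buyer_flows = [-x for x in seller_flows]
    return seller_flows, buyer_flows


# Futures prices at t = 0, 1, 2 and T = 3, as in the example above.
prices = [10_000, 9_900, 9_800, 10_100]
seller_flows, buyer_flows = mark_to_market(prices)

seller_total = sum(seller_flows)   # the farmer's overall result
buyer_total = sum(buyer_flows)     # the buyer's overall result

# Effective price of the wheat once the settlements are taken into account.
effective_price_paid = prices[-1] - buyer_total
effective_price_received = prices[-1] + seller_total
```

Whatever the intermediate prices, the daily transfers add up to the difference between the initial futures price and the final market price, so that both effective prices coincide with the price agreed upon at t = 0.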

We have already noted that one of the cardinal problems of the theory of forward 
and futures contracts is to determine the ‘fair’ values of their prices.

We show below, in Chapter VI, § 1e, how the arguments based on the ‘absence
of arbitrage’, combined with the martingale methods, enable one to derive formulas 
for the forward and futures prices of contracts with delivery time T sold at time 
t<T. 

δ) Options. The theory and practice of option contracts has its own concepts
and special vocabulary, and it is reasonable to become acquainted with them already
at this early stage. This is all the more important as a large portion of the
mathematical analysis of derivatives in what follows is related to options. There are
several reasons for this.

First, the mathematical theory of options is the most developed one, and this 
example is convenient for a study of the main principles of derivatives transactions 
and, in particular, pricing and hedging (i.e., protecting) strategies. 

Second, the actual number of options traded in the market runs into millions;
therefore, there exist impressive statistics, which are essential for the control of
the quality of various probabilistic models of the evolution of option prices.

Although options are long known as financial instruments (see, e.g., the book 
[346]), L. Bachelier [12] must be the first who, in 1900, gave a rigorous mathematical 
analysis of option prices and provided arguments in favor of investment in options. 
Moreover, as already noted, trade in options became institutionalized not so long 
ago, in 1973 (see § 1c).

(On the background of options. According to N. Apostolou’s “Keys to investing in
options and futures” (Barron’s Educational Series, 1991), “since the creation of the
Chicago Board Options Exchange (CBOE) in 1973, trading volume in stock options
has grown remarkably. The listed option has become a practical investment vehicle
for institutions and individuals seeking financial profit or protection. The CBOE
is the world’s largest options marketplace and the nation’s second-largest securities
exchange. Options are also traded on the American Stock Exchange (AMEX), the
New York Stock Exchange (NYSE), the Pacific Stock Exchange (PSE), and the
Philadelphia Stock Exchange (PHLX). Options are not limited to common stock.
They are written on bonds, currencies, and various indexes. The CBOE trades
options on listed and over-the-counter stocks, on Standard & Poor’s 100 and 500
market indexes, on U.S. Treasury bonds and notes, on long-term and short-term
interest rates, and on seven different foreign currencies.”)

ε) For definiteness, we assume in our discussion of options that transactions can
occur at time

\[ n = 0, 1, \ldots, N \]

and all the trade stops after the instant \(N\).

Assume also that we discuss options based on stock of value described by the
random sequence

\[ S = (S_n)_{0 \le n \le N}. \]

Using the standard vocabulary, we distinguish options of two kinds:


• buyer’s options (call options) and
• seller’s options (put options).


A call option gives one the right to buy. 

A put option gives one the right to sell. 

From the standpoint of financial engineering it is important that these two 
financial instruments ‘work in opposite directions’: as the gains from one of them 
increase, the gains from the other drop. This explains the widely used practice of 
diversification, of operating options in different classes, sometimes combined with 
other securities. 

As regards their exercise, options fall into two types, European and American. 

If an option can be shown for exercise only at some fixed time \(N\), then we call
\(N\) its maturity time (option’s expiration date) and our option belongs to the
European type.

On the other hand, if the option can be shown for exercise at an arbitrary (e.g.,
random) instant \(\tau \le N\), then we call it an option of American type.

In practice, most options are of American type. This gives the buyer more
freedom, allowing him to choose the exercise time \(\tau\). Note that these two types
of options are equivalent in certain cases (in the sense that the optimal exercise time
\(\tau\) of an American-type option is equal to \(N\); see Chapter VI, § 5b and
Chapter VIII, § 3c below for details).

We now consider for definiteness a standard call option of European type with
maturity time \(N\). It can be characterized by the (strike or exercise) price \(K\)
(preset at the instant of writing) at which the buyer will be able to buy, say, shares,
whose market price \(S_N\) at time \(N\) may be distinct (sometimes, considerably
distinct) from \(K\).

If \(S_N > K\), then the situation is favorable for the buyer because, by the contract
terms, he has the right to buy shares at the price \(K\). Once bought, they can be
immediately sold at the market price \(S_N\). In this case his gains from this operation
will be equal to \(S_N - K\).

On the other hand, if \(S_N\) turns out to be less than \(K\), then the buyer’s right of
purchase (at the price \(K\)) is of no value because he will be able to buy shares at a
lower price (\(S_N\)).

All combined, we can say that the buyer’s gains f_N at time N can be expressed
by the formula

\[ f_N = (S_N - K)^+, \]

where a^+ = max(a, 0).
Of course, one must pay a certain premium C_N for the acquisition of this financial
instrument, so that the net profit of the buyer of the call option is

\[ \begin{cases} (S_N - K) - C_N & \text{for } S_N > K, \\ -C_N & \text{for } S_N \le K. \end{cases} \]

Accordingly, the writer’s gain is

\[ C_N - (S_N - K) \quad \text{for } S_N > K \]

and

\[ C_N \quad \text{for } S_N \le K. \]


Hence it is clear that in purchasing a call option one anticipates a rise of stock
prices. (Note that, of course, the option premium C_N depends not only on N, but
also on K, and, clearly, the smaller K is, the larger C_N must be.)
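The case formulas above can be collected into a few lines of code. The following is only an illustrative sketch: the function names are hypothetical (they do not come from the text), and the premium is passed in as a given number rather than computed.

```python
def call_payoff(s_n: float, k: float) -> float:
    """Buyer's gain f_N = (S_N - K)^+ at maturity."""
    return max(s_n - k, 0.0)


def buyer_net_profit(s_n: float, k: float, premium: float) -> float:
    """Buyer's net profit: pay-off minus the premium C_N paid up front."""
    return call_payoff(s_n, k) - premium


def writer_net_profit(s_n: float, k: float, premium: float) -> float:
    """Writer's net profit: premium received minus the pay-off owed."""
    return premium - call_payoff(s_n, k)


if __name__ == "__main__":
    # Illustrative numbers only (not taken from the text).
    for s_n in (30.0, 35.0, 40.0):
        print(s_n, buyer_net_profit(s_n, 35.0, 2.0), writer_net_profit(s_n, 35.0, 2.0))
```

Whatever the final price, the two net profits add up to zero: one party's gain is the other's loss.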


Remark. There exist special names for those acting on the assumption of a rise
or fall of some article. The dealers expecting prices to go up are called bulls. A
bull opens a long position expecting to sell with profit afterwards, when the market
goes up. Those dealers who expect the market to move downwards are called bears.
A bear tends to sell securities he has (or even does not have, which is referred to as
‘short selling’). He hopes to close his short position by buying the traded items
afterwards, at lower prices. The difference between the current price and the purchase
price in the future will be his profit.

Depending on the relation between the market price S_0 at t = 0 and K, options
can be divided into three classes: options bringing a gain (in-the-money), ones with
gain zero (at-the-money), and options bringing losses (out-of-the-money). In the case
of a call option these classes correspond to the relations S_0 > K, S_0 = K, and
S_0 < K, respectively.

We must point out straight away that there is an enormous difference between 
the positions of a buyer and a seller. 

The buyer purchasing the option can simply wait till the maturity date N,
watching, if he likes, the dynamics of the prices S_n, n ≥ 0.

The position of the option writer is much more complicated because he must
bear in mind his obligation to meet the terms of the contract, which requires him
not merely to contemplate the changes in the prices S_n, n ≥ 0, but to use all
financial means available to him to build a portfolio of securities that ensures the
final payment of (S_N − K)^+.

The following two questions are central here: what is the ‘fair’ price C_N of
buying or selling an option, and what must a seller do to carry out the contract?

In the case of a standard put option of European type with maturity date N the
price K at which the option buyer is entitled to sell stock (at time N) is fixed.
Hence if the real price of the stock at time N is S_N and S_N < K, then selling it
at the price K brings K − S_N in revenues. The buyer’s net profit, taking into account
the premium P_N for the purchase of the option, is equal to

\[ (K - S_N) - P_N. \]


On the other hand, if S_N > K, i.e., the preset price is less than the market
price, then there is no sense in showing the option for exercise.

Hence the net profit of the buyer of a put option is (K − S_N)^+ − P_N.

As in the case of a call option, here we can also ask about a ‘fair’ price P_N
suitable for both writer and buyer.

2. As an illustration we consider an example of a call option.

Assume that we buy 10 contracts on stock. As a rule, each contract involves
100 shares, so we discuss a purchase of 1000 shares. Assume that the market price
S_0 of a share is 30 (American dollars, say), K = 35, N = 2, and the premium for
the total of 10 contracts is ($) 250.

Further, let the market price S_2 at time n = 2 be $40. In this case the option
is shown for exercise and the corresponding net profit is positive:


(40 − 35) × 1000 − 250 = ($) 4750.


However, if the market price S_2 is $35.1, then we again show the option for exercise
(since S_2 > K = $35), but the corresponding net profit is now negative:


(35.1 − 35) × 1000 − 250 = ($) −150.

It is clear that our profit is zero if

(S_2 − K) × 1000 = ($) 250,

i.e., S_2 = $35.25 (because K = $35).

Thus, whenever the share price S_2 at maturity is below $35.25, the buyer of the call
option takes losses.
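The arithmetic of this example is easy to reproduce. The sketch below uses only the numbers given in the text (10 contracts of 100 shares, K = 35, total premium 250); the names `example_net_profit` and `break_even` are introduced here for illustration.

```python
SHARES = 10 * 100   # 10 contracts, 100 shares each
STRIKE = 35.0       # exercise price K
PREMIUM = 250.0     # premium for all 10 contracts together


def example_net_profit(s2: float) -> float:
    """Buyer's net profit if the share price at maturity (n = 2) is s2."""
    return max(s2 - STRIKE, 0.0) * SHARES - PREMIUM


# Price at which the pay-off exactly covers the premium.
break_even = STRIKE + PREMIUM / SHARES

if __name__ == "__main__":
    for s2 in (40.0, 35.1, break_even, 30.0):
        # Rounded to cents because 35.1 is not exactly representable in binary.
        print(s2, round(example_net_profit(s2), 2))
```

For prices at or below the strike the option is not exercised and the loss is capped at the premium, which is the buyer's worst case.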

Assume now that we consider an American-type call option, which gives us the
right to exercise it at time τ = 1 or τ = 2. Imagine that the share price rises
sharply at the instant τ = 1, so that S_1 = $50. Then the buyer of the call option
can show it for exercise at this instant and pocket a huge net profit of


(50 − 35) × 1000 − 250 = 15 000 − 250 = ($) 14 750.


It is however clear that the premium for such an option must be considerably larger 
than $250 because ‘greater opportunities should come more expensive’. The actual 
prices of American-type options are indeed higher than those of European ones. 
(See Chapter VI below.)

We have considered the above example from the buyer’s standpoint. We now
turn to the writer’s position. 

In principle, he has two options: to sell stock that he already has (‘writing
covered stock’) or to sell stock he does not have at the moment (‘writing naked call’).

The latter is very risky and can be literally ruinous: if the option is indeed
shown for exercise (which happens when S_N > K), then, to meet the terms of the
contract, the seller must actually buy stock and sell it to the buyer at the price K.

If, e.g., S_2 = $40, then he must pay $40 000 for 1000 shares.

The buyer pays him $35 000 for them, and his premium was only $250; therefore, his
total losses are


40 000 − 35 000 − 250 = ($) 4750.


It must be noted that the writer’s net profit is in both cases at most $250. This
is a purely speculative profit made ‘from swings of stock prices’ and (in the case of
‘naked stocks writing’) in a rather risky way. We can say that here the ‘writer’s
profit is his risk premium’.

3. In practice, ‘large’ investors with big financial potentials reduce their risks by an
extensive use of diversification and hedging, investing funds in the most diverse securities
(stock, bonds, options, ...), commodities, and so on. Very interesting and instruc- 
tive in this respect is G. Soros’s book [451], where he repeatedly describes (see, for 
instance, the tables in §§ 11, 13 of [451]) the day-to-day dynamics (in 1968-1993) 
of the securities portfolio of his Quantum Fund (which contains various kinds of 
financial assets). For example, on August 4, 1968, this portfolio included stock, 
bonds, and various commodities ([451, p. 243]). 

From the standpoint of financial engineering Soros wrote a masterpiece of a 
handbook for those willing to be active players in the securities market. 


4. In our considerations of European call options, we have already seen that they 
can be characterized by 


1) their maturity date N, 
2) the pay-off function f_N.


• For a standard call option we have

\[ f_N = (S_N - K)^+. \]

• For a standard call option with aftereffect we have

\[ f_N = (S_N - K_N)^+, \]

where K_N = a · min(S_0, S_1, ..., S_N), a = const.

• For an arithmetic Asian call option we have

\[ f_N = (\bar{S}_N - K)^+, \qquad \bar{S}_N = \frac{1}{N+1} \sum_{k=0}^{N} S_k. \]

We must note that the quantity K entering, for instance, the pay-off function
f_N = (S_N − K)^+ of a standard call option, its strike price, is usually close to S_0.
As a rule, one never writes options with large disparity between S_0 and K.

Pay-off functions for put options are as follows: 


• for a standard put option we have

\[ f_N = (K - S_N)^+; \]

• for a standard put option with aftereffect we have

\[ f_N = (K_N - S_N)^+, \qquad K_N = a \cdot \max(S_0, S_1, \ldots, S_N), \quad a = \mathrm{const}; \]

• for an arithmetic Asian put option we have

\[ f_N = (K - \bar{S}_N)^+, \qquad \bar{S}_N = \frac{1}{N+1} \sum_{k=0}^{N} S_k. \]
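To make the difference between these pay-off functions concrete, here is a minimal sketch that evaluates each of them on a price path [S_0, ..., S_N]. The function names are hypothetical. Two readings of the text are assumed: the aftereffect strike is K_N = a · min (or max) of the path, and the Asian average runs over all N + 1 prices S_0, ..., S_N.

```python
from typing import Sequence


def standard_call(path: Sequence[float], k: float) -> float:
    """(S_N - K)^+ for a price path [S_0, ..., S_N]."""
    return max(path[-1] - k, 0.0)


def standard_put(path: Sequence[float], k: float) -> float:
    """(K - S_N)^+."""
    return max(k - path[-1], 0.0)


def aftereffect_call(path: Sequence[float], a: float) -> float:
    """(S_N - K_N)^+ with K_N = a * min(S_0, ..., S_N) (assumed reading)."""
    return max(path[-1] - a * min(path), 0.0)


def aftereffect_put(path: Sequence[float], a: float) -> float:
    """(K_N - S_N)^+ with K_N = a * max(S_0, ..., S_N) (assumed reading)."""
    return max(a * max(path) - path[-1], 0.0)


def arithmetic_mean(path: Sequence[float]) -> float:
    """Average of S_0, ..., S_N over N + 1 terms (assumed normalization)."""
    return sum(path) / len(path)


def asian_call(path: Sequence[float], k: float) -> float:
    """(mean(S) - K)^+."""
    return max(arithmetic_mean(path) - k, 0.0)


def asian_put(path: Sequence[float], k: float) -> float:
    """(K - mean(S))^+."""
    return max(k - arithmetic_mean(path), 0.0)
```

Unlike the standard options, the aftereffect and Asian pay-offs depend on the whole path, so two paths with the same final price S_N can pay different amounts.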


There exist many types of options, some of which have rather exotic names (see, 
for instance, [414] and below, Chapter VIII, § 4a). We also discuss several types of 
option-based strategies (combinations, spreads, etc.) in Chapter VI, § 4e. 

One attraction of options for a buyer is that they are not very expensive, al- 
though the commission can be considerable. To give an idea of the calculation of 
the price of an option (the premium for the option, the non-reimbursable payment 
for its purchase), we consider the following, slightly idealized, situation.

Assume that the stock price S_n, 0 ≤ n ≤ N, satisfies the relation

\[ S_n = S_0 + (\xi_1 + \cdots + \xi_n), \]

where S_0 > N is an integer and (ξ_k) is a sequence of independent identically
distributed random variables with distribution

\[ P(\xi_k = \pm 1) = \tfrac{1}{2}. \]

Then, of course, S_n > 0, 0 ≤ n ≤ N.

Assume also that we have at our disposal a bank account
B = (B_n)_{0≤n≤N}, where B_n ≡ 1 (i.e., the interest rate r = 0 and B_0 = 1). We
consider now a standard European call option with pay-off f_N = (S_N − K)^+.

We claim that the rational (or fair, mutually appropriate) price C_N of such an
option can be expressed as follows:

\[ C_N = E(S_N - K)^+, \]

i.e., the size of the premium is equal to the average gain of the buyer.

We enlarge on the formal definition of C_N and the methods of its calculation below
(see Chapter VI), while here we can substantiate the formula C_N = E(S_N − K)^+
as follows.

Assume that the writer’s ask price C̃_N is larger than C_N = E(S_N − K)^+ and the buyer
agrees to purchase the option at this price. We claim that the writer has a riskless
profit of C̃_N − C_N in this case.

In fact, the buyer acknowledges that the price must give the writer a possibility 
to meet the terms of the contract. It is clear that this price cannot be too low. 
But, understandably, the buyer also would not overpay: he would rather buy at the 
lowest price enabling the writer to meet the contract terms. 

In other words, we must show that the option writer can use the premium
C_N = E(S_N − K)^+ to meet the contract terms. For simplicity, we set N = 1 and
K = S_0. Then C_1 = Eξ_1^+ = 1/2. We now describe the ways in which the writer can
operate with this premium in the securities market.

We represent 1/2 as follows:

\[ \frac{1}{2} = \Bigl(\frac{1}{2} - \frac{S_0}{2}\Bigr) + \frac{1}{2}\, S_0 \quad (= \beta_0 \cdot 1 + \gamma_0 \cdot S_0), \]

that is, β_0 = 1/2 − S_0/2 and γ_0 = 1/2.

Setting X_0 = β_0 · 1 + γ_0 · S_0 (= 1/2), we can call this premium 1/2 the (initial)
capital of the writer; a part of it (β_0) is put into a bank account and another part
is invested in (γ_0) shares. The fact that β_0 is negative in our case means that the
seller merely overdraws his account, which, of course, must be repaid.

The pair (β_0, γ_0) forms the so-called writer’s investment portfolio at time n = 0.

What is this portfolio worth at the time n = 1? Denoting this amount by X_1
we obtain

\[ X_1 = \beta_0 B_1 + \gamma_0 S_1 = \beta_0 + \gamma_0 (S_0 + \xi_1)
      = \Bigl(\frac{1}{2} - \frac{S_0}{2}\Bigr) + \frac{1}{2}(S_0 + \xi_1)
      = \frac{1 + \xi_1}{2}
      = \begin{cases} 1 & \text{for } \xi_1 = 1, \\ 0 & \text{for } \xi_1 = -1. \end{cases} \]

Since

\[ \xi_1^+ = \frac{1 + \xi_1}{2}, \]

it is obvious that

\[ X_1 = f_1 \quad (= (S_1 - K)^+). \]

In other terms, the portfolio (β_0, γ_0) ensures that the amount X_1 is precisely
equal to the pay-off f_1, which enables the writer to meet the terms of the contract
and repay his debt.

For if ξ_1 = 1, then he gets (1/2)(S_0 + 1) for the stock. Since

\[ \frac{1}{2}(S_0 + 1) = 1 + \Bigl(\frac{S_0}{2} - \frac{1}{2}\Bigr), \]

this money is sufficient to repay the debt (S_0/2 − 1/2) and pay the buyer an amount
of (S_1 − K)^+ = (S_1 − S_0)^+ = ξ_1^+ = 1.

On the other hand, if ξ_1 = −1, then he obtains (1/2)(S_0 − 1) for the stock. The buyer
does not exercise the option in this case (because S_1 = S_0 + ξ_1 = S_0 − 1 < S_0 = K),
and therefore the writer must merely repay his debt (1/2)(S_0 − 1), which is precisely
equal to the money obtained for the stock (we now have ξ_1 = −1).
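The one-step replication argument can be checked mechanically. The following sketch assumes the portfolio β_0 = (1 − S_0)/2, γ_0 = 1/2 discussed above, with N = 1, K = S_0 and B_n ≡ 1; the function name `replicate_one_step` is hypothetical. Exact rational arithmetic is used so that the equalities hold without rounding.

```python
from fractions import Fraction


def replicate_one_step(s0: int):
    """Return the initial capital X_0 and, for each outcome xi_1 = +1 / -1,
    the pair (portfolio value X_1, call pay-off (S_1 - K)^+) with K = S_0."""
    beta0 = Fraction(1 - s0, 2)   # bank account position (negative: a loan)
    gamma0 = Fraction(1, 2)       # number of shares held
    x0 = beta0 + gamma0 * s0      # initial capital; bank account B_0 = 1
    outcomes = {}
    for xi in (1, -1):
        s1 = s0 + xi
        x1 = beta0 + gamma0 * s1  # B_1 = 1 because the interest rate is zero
        payoff = max(s1 - s0, 0)
        outcomes[xi] = (x1, payoff)
    return x0, outcomes


if __name__ == "__main__":
    x0, outcomes = replicate_one_step(10)
    print("initial capital:", x0)
    for xi, (x1, payoff) in outcomes.items():
        print("xi_1 =", xi, "portfolio:", x1, "pay-off:", payoff)
```

In both outcomes the portfolio value coincides with the pay-off, and the required initial capital is 1/2 regardless of S_0.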

Hence, if the writer sets C̃_1 > C_1 = E(S_1 − K)^+, then upon meeting all the
terms of the contract he gets a riskless profit of C̃_1 − C_1.

We now claim that if the premium C̃_1 is smaller than C_1 (= 1/2), then the writer
cannot meet the terms (without losses).

Indeed, choosing a portfolio (β_0, γ_0) we obtain

\[ X_0 = \beta_0 + \gamma_0 S_0 \]

and

\[ X_1 = \beta_0 + \gamma_0 (S_0 + \xi_1) = X_0 + \gamma_0 \xi_1. \]

If ξ_1 = 1, then, under the terms of the contract, the writer must pay an amount
of 1 to the buyer and, additionally, repay the debt −β_0, i.e., he must obtain

\[ \gamma_0 (S_0 + 1) = 1 - \beta_0 \]

for the stock, while if ξ_1 = −1, then he must get

\[ \gamma_0 (S_0 - 1) = -\beta_0 \]

for the stock.

All in all, we must set

\[ \gamma_0 = \frac{1}{2}, \qquad \beta_0 = \frac{1}{2} - \frac{S_0}{2}. \]

And yet, for these values of the parameters we have β_0 + γ_0 S_0 = 1/2, therefore the
equality X_0 = β_0 + γ_0 S_0 is impossible for X_0 < 1/2.

Thus, we have established the formula C_N = E(S_N − K)^+ for N = 1 and
S_0 = K. One can prove it in the general case by similar arguments. We are not
doing that here because we shall deduce the same formula below (in Chapter VI)
from general considerations.

Note that if S_0 = K, then

\[ C_N = E(\xi_1 + \cdots + \xi_N)^+ \]

and

\[ C_N \sim \sqrt{\frac{N}{2\pi}} \]

by the central limit theorem.


Hence the option premium increases as √N with time N. This is perfectly
consistent with the result following (in the case of S_0 = K) from Bachelier’s formula
for C_T at the end of § 1b:

\[ C_T = \sqrt{\frac{T}{2\pi}} \]

(for r = 0, B_0 = 1, σ = 1, and S_0 = K).
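The √N growth can be checked numerically by comparing the exact at-the-money premium with the central-limit approximation √(N/(2π)). This is an illustrative sketch for the symmetric ±1 walk with S_0 = K; the function names are hypothetical.

```python
from math import comb, pi, sqrt


def atm_call_price(n: int) -> float:
    """Exact C_N = E(xi_1 + ... + xi_N)^+ for the symmetric +-1 walk (S_0 = K)."""
    total = sum(comb(n, ups) * max(2 * ups - n, 0) for ups in range(n + 1))
    return total / 2 ** n


def clt_approximation(n: int) -> float:
    """Central-limit approximation sqrt(N / (2 * pi))."""
    return sqrt(n / (2 * pi))


if __name__ == "__main__":
    for n in (1, 10, 100, 1000):
        exact = atm_call_price(n)
        approx = clt_approximation(n)
        print(n, exact, approx, exact / approx)
```

The ratio of the two values approaches 1 as N grows, which is what the asymptotic relation asserts.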


5. We have already mentioned that there is a great variety of option kinds on the 
market. 

Options on indexes are quite common; for instance, options on the S&P100 and 
the S&P500 Indexes traded on the CBOE. (Options on S&P100 are of American 
type, while ones on S&P500 are of European type.) In view of the large volatility, 
the maturity time of these options is fairly short. See also Chapter VIII, § 4a. 

In the case of options on futures the role of the prices (S_n) is played by the
futures prices (Φ_n), whereas in the case of options on some index it is the value of
this index that plays the role of the prices (S_n).


2. Financial Markets under Uncertainty. 
Classical Theories of the Dynamics of Financial Indexes, 
their Critics and Revision. Neoclassical Theories 


Here are just several questions that arise naturally once one has got acquainted 
with the theory and practice of financial markets: 


• how do financial markets operate under uncertainty?
• how are the prices set and how can their dynamics in time be described?
• what concepts and theories must one use in calculations?
• can the future development of the prices be predicted?
• what are the risks of the use of one or another financial instrument?


In our descriptions of the price dynamics and in pricing derivative financial 
instruments we shall take the standpoint of a market without opportunities for ar- 
bitrage. Mathematically, this transparent economic assumption means that there 
exists a so-called ‘martingale’ (risk-neutral) probability measure such that the (dis- 
counted) prices are martingales with respect to it. This, in turn, gives one a pos- 
sibility to use the well-developed machinery of stochastic calculus for the study of 
the evolution of prices and for various calculations. 

Although we do not intend to give here a detailed account of the existing theories 
and concepts relating to financial markets, we shall discuss those that are closer to 
our approach, where the main (probabilistic) stress is put on the applications of sto- 
chastic calculus and statistics in financial mathematics and engineering. (We point 
out several handbooks and monographs: [79], [83], [112], [117], [151], [240], [268], 
[284], [332]-[334], [387], [460], where one can find the discussion of most diverse
aspects of financial markets, economic theories and concepts, theories designed for
the conditions of certainty or uncertainty, equilibrium models, optimality, utility,
portfolios, risks, financial decisions, dividends, derivative securities, etc.)

We note briefly that back in the 1920s, considered to be the birth time of financial
theory, its central interests were mainly concentrated around the problems of
managing and raising funds, while its ‘advanced mathematics’ was virtually reduced
to the calculation of compound interest.

The further development proceeded along two lines: under the assumption of a 
complete certainty (as regards the prices, the demand, etc.) and under the assump- 
tion of uncertainty. 

Decisive for the first approach were the results of I. Fisher [159] and F. Modigliani
and M. Miller [356], [350], who considered the issue of optimal decisions that must be
made by individual investors and corporations, respectively. Mathematically, this
reduces to the problem of maximization under constraints for functions of several 
variables. 

In the second field, we must first of all single out H. Markowitz [268] (1952) and 
M. Kendall [269] (1953). 

Markowitz’s paper, which established a basis of investment portfolio theory, 
concentrated on the optimization of investment decisions under uncertainty. The
corresponding probabilistic method, the so-called mean-variance analysis, revealed 
a central role of the covariance of prices, which is a key ingredient determining the 
(unsystematic) risks of the investment portfolio in question. It was after that paper 
that one became fully aware of the importance of diversification (in making up a 
portfolio) to the reduction of unsystematic risks. This idea later influenced two
classical theories developed by W. Sharpe [433] (1964) and S. Ross [412] (1976):


• CAPM, the Capital Asset Pricing Model [433], and
• APT, the Arbitrage Pricing Theory [412].


These theories describe the components making up the yield from securities (for 
example, stock), explain their dependence on the state of the ‘global market’ 
of this kind of securities (CAPM-theory) and the factors influencing this yield 
(APT-theory), and describe the concepts that must underlie financial calculations. 
We discuss the central principles of Markowitz’s theory, CAPM, and APT below,
in §§ 2b–2d.

It is already clear from this brief exposition that all the three theories are related 
to the reduction of risks in financial markets. 

In the discussion of risks,³ it should be noted that financial theory usually
distinguishes two types of them:


• unsystematic risks, which can be reduced by diversification (they are also
called diversifiable, residual, specific, idiosyncratic risks, etc.), i.e., risks that
the investor can influence by his actions, and

• systematic, or market risks in the proper sense of the term (undiversifiable
risks).

An example of a systematic risk is the one related to the stochastic nature of interest
rates or stock indexes, which a (‘small’) investor cannot change on his own. This
does not imply, of course, that one cannot withstand such risks. It is in effect to
control possible systematic risks, to work out recommendations for rational financial
decisions, and to shelter from big and catastrophic risks (e.g., in insurance) that one
develops (fairly complicated) systems for the collection and processing of statistical
data, and for the prediction of the development of market prices. This is in fact the
raison d'être of derivative financial instruments, such as futures contracts, options,
combinations, spreads, etc. This is also the purpose of hedging: techniques more
complicated than diversification, which were developed for these instruments, take into
account random changes of future prices, and aim at reducing the risks
of possible unfavorable effects of these changes. (One can find a detailed discussion
of hedging and the corresponding pricing theory in Chapter VI.)

³ All sorts of risks, including insurance risks (see §§ 3a, 3b below).

M. Kendall’s lecture [269] (1953) at a session of the Royal Statistical Society
(London) concentrated on another question, which is more basic in a certain
sense: how do market prices behave, and what stochastic processes can be used to
describe their dynamics? The questions posed by that important work ultimately
brought about the Efficient Capital Market Theory (the ECM-theory), which we discuss
in the next section, and its many refinements and generalizations.


§2a. Random Walk Conjecture and Concept of Efficient Market 


1. In several studies made in the 1930s their authors carried out empirical analysis 
of various financial characteristics in an attempt to answer the notorious ques- 
tion: is the movement of prices, values, and so on predictable? These papers were 
mostly written by statisticians; we single out among them A. Cowles [84] (1933) 
and [85] (1944), H. Working [480] (1934), and A. Cowles and H. Jones [86] (1937). 
A. Cowles considered data from stock markets, while H. Working discussed
commodity prices.

Although one could find in these papers plenty of statistical data and also
interesting (and unexpected) conclusions that the increments h_n = ln(S_n / S_{n−1}) of
the logarithms of the prices S_n, n ≥ 1, must be independent, neither economists
nor practitioners paid much attention to this research.

As noted in [35; p. 93], one probable explanation can be that, first, economists
regarded the price dynamics as a ‘sideshow’ of the economy and, second, there were
not so many economists at that time who had a suitable background in mathematics
and mastered statistical techniques.


As regards the practitioners, the conclusion made by these researchers that the
sequence (H_n)_{n≥1}, where H_n = h_1 + ··· + h_n, had the nature of a random walk
(i.e., was the sum of independent random variables) was in contradiction with the
prevailing idea of the time that the prices had their rhythms, cycles, trends, ... , and
one could predict price movements by revealing them.

There had been in fact no important research in this field since then and until
1953, when M. Kendall published the already mentioned paper [269], which opened
up the modern era in the research of the evolution of financial indicators.

The starting point of Kendall’s analysis was his intention to detect cycles in
the behavior of the prices of stock and commodities. Analyzing factual data (the 
weekly records of the prices of 19 stocks in 1928-1938, the monthly average wheat 
prices on the Chicago market in 1883-1934, and the cotton prices at the New York 
Mercantile Exchange in 1816-1951) he, to his own surprise, could find no rhythms, 
trends, or cycles. More than that, he concluded that these series of data look as if 
“... the Demon of Chance drew a random number ... and added it to the current 
price to determine the next... price”. In other words, the logarithms of the prices
S = (S_n) appear to behave as a random walk: setting h_n = ln(S_n / S_{n−1}) we obtain

\[ S_n = S_0 e^{H_n}, \qquad n \ge 1, \]

where H_n is the sum of independent random variables h_1, ..., h_n.

Here it seems appropriate to recall again (cf. § 1b) that in fact the first author
(before A. Cowles and H. Working) to put forward the idea to use a random walk
to describe the evolution of prices was L. Bachelier [12]. He conjectured that the
prices S^{(Δ)} = (S^{(Δ)}_{kΔ}) (not logarithms, by the way) change their values at instants
Δ, 2Δ, ... so that

\[ S^{(\Delta)}_{k\Delta} = S_0 + \xi_{\Delta} + \xi_{2\Delta} + \cdots + \xi_{k\Delta}, \]

where (ξ_{iΔ}) are independent identically distributed random variables taking the
values ±σ√Δ with probability 1/2. Hence

\[ E S^{(\Delta)}_{k\Delta} = S_0, \qquad D S^{(\Delta)}_{k\Delta} = \sigma^2 \cdot (k\Delta). \]


We have already mentioned that, setting k = [t/Δ], t > 0, and passing formally to
the limit, L. Bachelier discovered that the limiting process S = (S_t)_{t≥0}, where

\[ S_t = \lim_{\Delta \to 0} S^{(\Delta)}_{[t/\Delta]\Delta} \]

(we must understand the limit here in a certain suitable probabilistic
sense), had the following form:

\[ S_t = S_0 + \sigma W_t, \]


where W = (W_t) (W_0 = 0, EW_t = 0, EW_t^2 = t) was a process that is now usually
called a standard Brownian motion or a Wiener process: a process with independent
Gaussian (normal) increments and continuous trajectories. (See Chapter III,
§§ 3a, 3b for greater detail.)


2. Interest in a more thorough study of the dynamics of financial indexes and in
the construction of various probabilistic models explaining the phenomena revealed
by observations (such as, e.g., the cluster property) grew significantly after
M. Kendall’s paper. We single out two studies of the late 1950s: [405] by H. Roberts
(1959) and [371] by M. F. M. Osborne (1959).

Roberts’s paper, which drew upon the ideas of H. Working and M. Kendall,
was addressed directly to practitioners and contained heuristic arguments in favor
of the random walk conjecture. “Brownian Motion in the Stock Market” by the
astrophysicist Osborne grew, in his own words (see [35; p. 103]), from the desire
to test his physical and statistical techniques on such ‘earthly’ items as stock prices.
Unacquainted with the works of L. Bachelier, H. Working, and M. Kendall as he
was, M. F. M. Osborne came in effect to the same conclusions; he pointed out,
however (and this proved to be important for the subsequent development), that
it was the logarithms of the prices S_t that varied in accordance with the law
of a Brownian motion (with drift), not the prices themselves (which were the main
point of Bachelier’s analysis). This idea was later developed by P. Samuelson [420],
who introduced into the theory and practice of finance a geometric (or, to use his
own term, economic) Brownian motion

\[ S_t = S_0 e^{\sigma W_t + (\mu - \sigma^2/2) t}, \qquad t \ge 0, \]

where W = (W_t) is a standard Brownian motion.
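A path of geometric Brownian motion can be simulated directly from this formula by accumulating independent Gaussian increments of W. The sketch below is illustrative only: the function name `simulate_gbm`, the time grid, the seed, and the parameter values are assumptions, and μ denotes the drift parameter in the exponent.

```python
import math
import random


def simulate_gbm(s0: float, mu: float, sigma: float,
                 t_max: float, steps: int, seed: int = 0) -> list:
    """Sample S_t = S_0 * exp(sigma * W_t + (mu - sigma^2 / 2) * t)
    on an equally spaced grid of `steps` intervals over [0, t_max]."""
    rng = random.Random(seed)
    dt = t_max / steps
    w = 0.0                      # W_0 = 0
    path = [s0]
    for i in range(1, steps + 1):
        # Increments of W are independent N(0, dt) variables.
        w += rng.gauss(0.0, math.sqrt(dt))
        t = i * dt
        path.append(s0 * math.exp(sigma * w + (mu - sigma ** 2 / 2) * t))
    return path


if __name__ == "__main__":
    path = simulate_gbm(100.0, 0.05, 0.2, 1.0, 252, seed=1)
    print(len(path), path[0], path[-1])
```

Because the price is an exponential of the Brownian motion, every simulated value is positive, in contrast to Bachelier's model S_t = S_0 + σW_t, where prices can become negative.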


3. It would be an exaggeration to say that the use of the random walk conjecture
for the description of the evolution of prices was accepted by economists then and
there. But it was this conjecture that gave rise to the concept of a rational (or, as one
often says, efficient) market, whose initial purpose was to provide arguments in
favor of the use of probabilistic concepts and, in this context, to demonstrate the
plausibility of the random walk conjecture and of the (more general) martingale
conjecture.

In a few words, ‘efficiency’ here means that the market responds rationally to new
information. This implies that, on this market,


1) corrections of prices are instantaneous and the market is always in ‘equilib- 
rium’, the prices are ‘fair’ and leave the participants no room for arbitrage, 
i.e., for drawing profits from price differentials; 

2) the dealers (traders, investors, etc.) are uniform in their interpretation of 
the obtained information and correct their decisions instantaneously as new 
information becomes available; 

3) the participants are homogeneous in their goals; their actions are ‘collec- 
tively rational’. 


Incidentally, on the formal side the concept of ‘efficiency’ must be considered as 
related to and dependent upon the nature of the information flowing to the market 
and its participants. 

One usually distinguishes between three kinds of accessible data: 


1° the past values of the prices; 

2° information of a broader character than the prices, contained in
generally accessible sources (newspapers, bulletins, TV, etc.);

3° all conceivable information.

For a suitable formalization of our concept of ‘information’ we start from the
assumption that the ‘uncertainty’ in the market can be described (see § 1a in Chapter II
for additional detail) as ‘randomness’ interpreted in the context of some
probability space (Ω, ℱ, P). As usual, here

Ω = {ω} is the space of elementary outcomes,

ℱ is some σ-algebra of subsets of Ω,

P is a probability measure on (Ω, ℱ).

It is worthwhile to endow the probability space (Ω, ℱ, P) with a flow (filtration)
𝔽 = (ℱ_n)_{n≥0} of σ-subalgebras ℱ_n such that ℱ_m ⊆ ℱ_n ⊆ ℱ for m ≤ n.

We interpret the events in ℱ_n as the ‘information’ accessible to an observer up
to the instant n.


4. Remark. In connection with the concept of an ‘event’ observable before time n,
which is formally a set in ℱ_n, we emphasize the following.

In experimenting (e.g., keeping record of the prices) we are usually interested not
so much in a specific outcome as in whether this outcome
belongs to one or another subset of all possible outcomes.

We mean by observable events sets A ⊆ Ω such that, under the conditions of our
experiment (given the general state of the market in our case), there can be answers
of two types, ‘the outcome ω belongs to A’ or ‘the outcome ω does not belong to A’.
For instance, if ω = (ω_1, ω_2, ω_3), where we interpret the value ω_i = +1 as a rise in
prices at time i, while ω_i = −1 means a drop in prices at the same instant, then
the space of all conceivable outcomes (in this three-step model) consists of eight
points:

\[ \Omega = \{(-1,-1,-1),\ (-1,-1,+1),\ \ldots,\ (+1,+1,+1)\}. \]

If we register all the values ω_1, ω_2, ω_3, then, for instance, the set

\[ A = \{(-1,+1,-1),\ (+1,-1,+1)\} \]

is an ‘event’ since we can say with confidence whether ‘ω ∈ A’ or ‘ω ∉ A’.
However, if we cannot record the result ω_2 at time i = 2, i.e., we have no
information on this value, then A is not an ‘event’ any longer: knowing only ω_1
and ω_3 we cannot answer the question whether ω = (ω_1, ω_2, ω_3) is in the set A.
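The three-step example can be checked by brute force. In the sketch below (the names `OMEGA` and `is_observable` are hypothetical), a set A counts as an observable event if membership in A is determined by the recorded coordinates alone, i.e., any two outcomes that agree on the recorded coordinates are either both in A or both outside it.

```python
from itertools import product

# All eight outcomes (w_1, w_2, w_3) with w_i = +1 (rise) or -1 (drop).
OMEGA = list(product((-1, 1), repeat=3))


def is_observable(event, observed_coords) -> bool:
    """True if membership in `event` can be decided from the coordinates
    listed in `observed_coords` (0-based indices) alone."""
    event = set(event)
    for w in OMEGA:
        for v in OMEGA:
            same_record = all(w[i] == v[i] for i in observed_coords)
            if same_record and ((w in event) != (v in event)):
                return False   # identical records, different answers
    return True


A = {(-1, 1, -1), (1, -1, 1)}

if __name__ == "__main__":
    print(is_observable(A, (0, 1, 2)))  # all three values recorded
    print(is_observable(A, (0, 2)))     # w_2 not recorded
```

With all three coordinates recorded the set A is an event; once ω_2 is dropped, the outcomes (−1, +1, −1) and (−1, −1, −1) produce the same record although only the first belongs to A.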


5. In stochastic calculus one usually calls spaces (Ω, ℱ, 𝔽 = (ℱ_n), P) with distinguished
flows 𝔽 = (ℱ_n) of σ-algebras filtered probability spaces. In the framework
of financial mathematics we shall also call 𝔽 = (ℱ_n) an information flow. Using
this concept we can describe various forms of efficient markets as follows.

Assume that there are three flows of σ-algebras

\[ \mathbb{F}^1 = (\mathcal{F}^1_n), \qquad \mathbb{F}^2 = (\mathcal{F}^2_n), \qquad \mathbb{F}^3 = (\mathcal{F}^3_n) \]

in (Ω, ℱ, P), where ℱ^1_n ⊆ ℱ^2_n ⊆ ℱ^3_n, and we interpret each of the σ-algebras ℱ^i_n as
the data of the (i)th kind arriving at the instant n.

Following E. Fama [150] (1965) we shall say that a market is weakly efficient
if there exists a discounting price B = (B_n)_{n≥0} (usually, this is a risk-free bank
account) and a probability measure P̃ locally equivalent to P (i.e., its restrictions
P̃_n = P̃ | ℱ^1_n to ℱ^1_n are equivalent to P_n = P | ℱ^1_n for each n ≥ 0) such that each
price S = (S_n) (of a financial instrument on this market) satisfies the following
condition: the ratio

\[ \frac{S}{B} = \Bigl(\frac{S_n}{B_n}\Bigr)_{n \ge 0} \]

is a P̃-martingale, i.e., the variables S_n/B_n are ℱ^1_n-measurable and

\[ \widetilde{E}\,\Bigl|\frac{S_n}{B_n}\Bigr| < \infty, \qquad
   \widetilde{E}\Bigl(\frac{S_{n+1}}{B_{n+1}} \Bigm| \mathcal{F}^1_n\Bigr) = \frac{S_n}{B_n}, \qquad n \ge 0. \]


If we have the martingale property with respect to the information flow
𝔽^3 = (ℱ^3_n), then we have a strongly efficient market, while in the case of 𝔽^2 = (ℱ^2_n)
we have a semi-strongly efficient market.

For simplicity, we shall assume throughout this section that B_n ≡ 1 and P̃ = P.

Before a discussion of these definitions we point out the following relation between
the classes of martingales ℳ^1 = ℳ(𝔽^1), ℳ^2 = ℳ(𝔽^2), and ℳ^3 = ℳ(𝔽^3)
with respect to the flows 𝔽^1, 𝔽^2, and 𝔽^3:

\[ \mathcal{M}^3 \subseteq \mathcal{M}^2 \subseteq \mathcal{M}^1. \]


In fact, the inclusion S = (S_n)_{n≥0} ∈ ℳ^2 (for one) means that the S_n are
ℱ^2_n-measurable and E(S_{n+1} | ℱ^2_n) = S_n. Hence, by the ‘telescopic’ property of
conditional expectations,

\[ E(S_{n+1} \mid \mathcal{F}^1_n) = E\bigl(E(S_{n+1} \mid \mathcal{F}^2_n) \bigm| \mathcal{F}^1_n\bigr) = E(S_n \mid \mathcal{F}^1_n), \]

and since the S_n are ℱ^1_n-measurable (we recall that ℱ^1_n is generated by all prices
up to time n, including the variables S_k, k ≤ n), it follows that E(S_{n+1} | ℱ^1_n) = S_n,
i.e., S ∈ ℳ^1.

If ξ_1, ξ_2, ... is a sequence of independent random variables such that E|ξ_k| < ∞,
Eξ_k = 0, k ≥ 1, ℱ^ξ_n = σ(ξ_1, ..., ξ_n), ℱ^ξ_0 = {∅, Ω}, and ℱ^ξ_n ⊆ ℱ^1_n, then, evidently,
the sequence S = (S_n)_{n≥0}, where S_n = ξ_1 + ··· + ξ_n for n ≥ 1 and S_0 = 0, is a
martingale with respect to 𝔽^ξ = (ℱ^ξ_n)_{n≥0} and

\[ E(S_{n+1} \mid \mathcal{F}^i_n) = S_n + E(\xi_{n+1} \mid \mathcal{F}^i_n), \qquad i = 1, 2, 3. \]


Hence the sequence S = (S_n)_{n≥0} is clearly a martingale of class ℳ^3 if for each
n the variable ξ_{n+1} is independent of ℱ^3_n (in the sense that for each Borel set A
the event {ξ_{n+1} ∈ A} is independent of the events in ℱ^3_n). Thus, if we treat ξ_{n+1}
as ‘entirely fresh data’ relative to ℱ^3_n, then S is in the class ℳ^3.

It is important for what follows that if $X = (X_n)$ is a martingale with respect
to the filtration $\mathcal{F} = (\mathcal{F}_n)$ and $X_n = x_1 + \cdots + x_n$ with $x_0 = 0$, then $x = (x_n)$ is a
martingale difference, i.e.,

$$x_n \text{ is } \mathcal{F}_n\text{-measurable}, \qquad E|x_n| < \infty, \qquad E(x_n \mid \mathcal{F}_{n-1}) = 0.$$

It follows from the last property that, provided that $E|x_n|^2 < \infty$ for $n \ge 1$, we have

$$E\,x_n x_{n+k} = 0$$

for each $n \ge 0$ and $k \ge 1$, i.e., the variables $x = (x_n)$ are uncorrelated. In other
words, square-integrable martingales belong to the class of random sequences with
orthogonal increments:

$$E\,\Delta X_n\,\Delta X_{n+k} = 0,$$

where $\Delta X_n = X_n - X_{n-1} = x_n$ and $\Delta X_{n+k} = x_{n+k}$. Denoting the class of such
sequences by $OI_2$ (here the subscript 2 corresponds to ‘square integrability’) we
obtain

$$\mathcal{M}_2^3 \subseteq \mathcal{M}_2^2 \subseteq \mathcal{M}_2^1 \subseteq OI_2.$$
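The orthogonality of martingale differences can be checked exactly on a toy example. The following Python sketch is only an illustration under assumed data: it enumerates all paths of four fair coin flips, and the increment rule `increments`, the horizon `N_STEPS`, and the other names are hypothetical choices, not part of the theory. The increments are deliberately made dependent on the past, yet every cross moment $E\,x_n x_{n+k}$ vanishes.

```python
from fractions import Fraction
from itertools import product

N_STEPS = 4  # horizon of the toy model (an arbitrary, assumed choice)

def increments(eps):
    """Martingale differences x_1..x_n built from fair coin flips eps_k = +/-1.

    x_k = eps_k * (1 + number of +1 flips before time k): the size of each
    increment depends on the past, but its conditional mean is zero.
    """
    xs, ups = [], 0
    for e in eps:
        xs.append(e * (1 + ups))
        ups += (e == 1)
    return xs

paths = [increments(eps) for eps in product((-1, 1), repeat=N_STEPS)]
prob = Fraction(1, len(paths))  # all 2**N_STEPS paths are equally likely

def mean(f):
    """Exact expectation of f(x_1, ..., x_n) over all paths."""
    return sum(prob * f(xs) for xs in paths)

# E x_n x_{n+k} for every pair of distinct times (1-based keys)
cross_moments = {
    (i + 1, j + 1): mean(lambda xs, i=i, j=j: xs[i] * xs[j])
    for i in range(N_STEPS) for j in range(i + 1, N_STEPS)
}

# Dependence check: covariance of the squared increments x_2^2 and x_3^2
cov_squares = (mean(lambda xs: xs[1] ** 2 * xs[2] ** 2)
               - mean(lambda xs: xs[1] ** 2) * mean(lambda xs: xs[2] ** 2))
```

In this sketch all entries of `cross_moments` are zero, while `cov_squares` is non-zero: the increments are uncorrelated without being independent, which is exactly the situation covered by the class $OI_2$.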


It follows from the above that, in the final analysis, the efficiency of a market
is nothing else but the martingale property of prices in it. One example is a market
where prices are ‘random walks’.⁴


6. Why do we believe the conjecture of the ‘martingale property’, which generalizes 
the ‘random walk’ conjecture and is inherent in the concept of ‘efficient market’, 
to be quite a natural one? There can be several answers. Arguably, the best 
explanation can be given in the framework of the theory of arbitrage-free markets, 
which directly associates the efficiency (or more specifically, the rationality) of 
a market with the absence of opportunities for arbitrage. It will be clear from 
what follows that the latter property immediately results in the appearance of 
martingales. (See Chapter V, § 2a for fuller detail.) 

Now, to give an idea of the way in which martingales appear in this context, we 
present the following elementary arguments. 


⁴ In the probability and statistics literature one usually considers a ‘random walk’ to
be a walk that can be described by the sum of independent random variables. On the
other hand, economists sometimes use this term in another sense, to emphasize merely
the random nature of, say, price movement.

Let $S = (S_n)_{n \ge 1}$, where $S_n$ is the price of, say, one share at the instant $n$. Let

$$\rho_n = \frac{\Delta S_n}{S_{n-1}}, \qquad n \ge 1,$$

(here $\Delta S_n = S_n - S_{n-1}$) be the relative change in prices (the interest rate) and
assume that the market is organized in such a way that, with respect to the flow
$\mathcal{F} = (\mathcal{F}_n)$ of accessible data, the variables $S_n$ are $\mathcal{F}_n$-measurable and ($P$-almost
everywhere)

$$E(\rho_n \mid \mathcal{F}_{n-1}) = r \tag{1}$$

for some constant $r$. By the last two formulas,

$$S_n = (1 + \rho_n) S_{n-1} \tag{2}$$

and (assuming that $1 + r \ne 0$)

$$S_{n-1} = \frac{E(S_n \mid \mathcal{F}_{n-1})}{1 + r}. \tag{3}$$

By assumption,

$$\Delta S_n = \rho_n S_{n-1}, \qquad n \ge 1.$$


We also assume that, along with our share, we consider a bank account
$B = (B_n)_{n \ge 0}$ such that

$$\Delta B_n = r B_{n-1}, \qquad n \ge 1, \tag{4}$$

where $r$ is the interest rate of the account and $B_0 > 0$.
Since $B_n = B_0 (1 + r)^n$ by (4), we can see from (3) that

$$\frac{S_{n-1}}{B_{n-1}} = E\left(\frac{S_n}{B_n} \,\Big|\, \mathcal{F}_{n-1}\right).$$

This means precisely that the sequence $\left(\dfrac{S_n}{B_n}\right)_{n \ge 1}$ is a martingale with respect to
the flow $\mathcal{F} = (\mathcal{F}_n)_{n \ge 1}$.
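The one-step martingale identity for the discounted price can be verified exactly in a small model. The Python sketch below assumes a hypothetical two-point law for $\rho_n$ (values 3/10 and −1/5 with equal probabilities, so that the conditional mean equals the assumed bank rate $r = 1/20$); none of these numbers comes from the text.

```python
from fractions import Fraction

r = Fraction(1, 20)                      # bank rate (assumed value)
# Two equally likely values of rho_n with mean exactly r (assumed toy law)
rho_values = [(Fraction(3, 10), Fraction(1, 2)),
              (Fraction(-1, 5), Fraction(1, 2))]

def discounted_step(s_prev, b_prev):
    """Return (S_{n-1}/B_{n-1}, E(S_n/B_n | F_{n-1})) for one step."""
    b_next = b_prev * (1 + r)                        # bank account, relation (4)
    cond_exp = sum(p * (1 + rho) * s_prev / b_next   # price update (2), averaged over rho_n
                   for rho, p in rho_values)
    return s_prev / b_prev, cond_exp

mean_rho = sum(p * rho for rho, p in rho_values)     # should equal r, as in (1)
lhs, rhs = discounted_step(Fraction(100), Fraction(1))
```

Since the computation uses exact fractions, `lhs` and `rhs` coincide exactly; changing the law of $\rho_n$ so that its mean differs from $r$ breaks the equality.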

Our above assumption $E(\rho_n \mid \mathcal{F}_{n-1}) = r$ ($P$-a.e.) seems to be fairly natural ‘in
an economist’s view’: otherwise (e.g., if $E(\rho_n \mid \mathcal{F}_{n-1}) > r$ ($P$-almost everywhere)
or $E(\rho_n \mid \mathcal{F}_{n-1}) < r$ ($P$-almost everywhere) for $n \ge 1$) the investors will find out
promptly that it is more profitable to restrict their investment to stock (in the first
case) or to the bank account (in the second case). To put it another way, if one
security ‘dominates’ another, then the less valuable one will swiftly disappear, as it
should be in a ‘well-organized’, ‘efficient’ market.

7. We consider now a somewhat more complicated version of our model (2) of the
evolution of stock prices.

If at time $n - 1$ you buy one share at a price $S_{n-1}$ and, at time $n$,
sell it (at a price $S_n$), then your (gross) ‘profit’ (which can be either positive or negative)
is $\Delta S_n = S_n - S_{n-1}$. It is, of course, more sensible to measure the ‘profit’ in the
relative values $\Delta S_n / S_{n-1}$ (as we did earlier), rather than in the absolute ones, $\Delta S_n$,
i.e., to compare $\Delta S_n$ and the money $S_{n-1}$ paid for the share.

For example, if $S_{n-1} = 20$, while $S_n = 29$, then $\Delta S_n = 9$, which is not all that
little compared with 20. But if $S_{n-1} = 200$ and $S_n = 209$, then the increment $\Delta S_n$
is 9 again; now, however, compared with 200, this is not all that much.

Thus, $\rho_n = 9/20$ (= 45%) in the first case, while $\rho_n = 9/200$ (= 4.5%) in the
second.

One often calls for brevity these relative profits the returns or the growth co- 
efficients (alongside the already used term ‘(random) interest rate’). We shall 
occasionally use this terminology in what follows. 

In accordance with our interpretation of the increments $\Delta S_n = S_n - S_{n-1}$ as
the profits from buying (at time $n - 1$) and selling (at time $n$), we now assume that
we have an additional source of revenue, e.g., dividends on stock, which we assume
to be $\mathcal{F}_n$-measurable and equal to $\delta_n$ at time $n$.

Then our total ‘gross’ profits are $\Delta S_n + \delta_n$, while their relative value is

$$\rho_n = \frac{\Delta S_n + \delta_n}{S_{n-1}}. \tag{5}$$


It would be interesting to have a notion of the possible ‘global’ pattern of the behavior
of the prices $(S_n)$, provided that, ‘locally’, this behavior can be described by
the model (5). Clearly, to answer this question we must make certain assumptions
about $(\rho_n)$ and $(\delta_n)$.

With this in mind we now assume, for instance, that

$$E(\rho_n \mid \mathcal{F}_{n-1}) \equiv r \ge 0$$

for all $n \ge 1$. If, in addition, $E|S_n| < \infty$ and $E|\delta_n| < \infty$, then

$$S_{n-1} = \frac{1}{1 + r}\bigl[E(S_n \mid \mathcal{F}_{n-1}) + E(\delta_n \mid \mathcal{F}_{n-1})\bigr] \tag{6}$$

by (5), where $E(\,\cdot \mid \mathcal{F}_{n-1})$ is a conditional expectation.
In a similar way,

$$S_n = \frac{1}{1 + r} E(S_{n+1} \mid \mathcal{F}_n) + \frac{1}{1 + r} E(\delta_{n+1} \mid \mathcal{F}_n),$$

which, in view of (6), brings us to the equality

$$S_{n-1} = \frac{1}{(1+r)^2} E(S_{n+1} \mid \mathcal{F}_{n-1}) + \frac{1}{(1+r)^2} E(\delta_{n+1} \mid \mathcal{F}_{n-1}) + \frac{1}{1+r} E(\delta_n \mid \mathcal{F}_{n-1}).$$

Continuing, we see that

$$S_n = \frac{1}{(1+r)^k} E(S_{n+k} \mid \mathcal{F}_n) + \sum_{i=1}^{k} \frac{1}{(1+r)^i} E(\delta_{n+i} \mid \mathcal{F}_n) \tag{7}$$

for each $k \ge 1$ and each $n \ge 1$.

Hence it is clear that, for $r > 0$, each bounded solution ($|S_n| \le \text{const}$ for $n \ge 1$) of (6)
(for $n \ge 1$) has the following form (provided that $|E(\delta_{n+i} \mid \mathcal{F}_n)| \le \text{const}$ for $n \ge 0$,
$i \ge 1$):

$$S_n = \sum_{i=1}^{\infty} \frac{1}{(1+r)^i} E(\delta_{n+i} \mid \mathcal{F}_n). \tag{8}$$

In the economics literature this is called the market fundamental solution (see,
e.g., [211]). In the particular case of dividends unchanged with time ($\delta_n \equiv \delta = \text{const}$)
and $E(\rho_n \mid \mathcal{F}_{n-1}) = r > 0$ it follows from (8) that the (bounded) prices $S_n$,
$n \ge 1$, must also be constant:

$$S_n = \frac{\delta}{r}, \qquad n \ge 1.$$
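For constant dividends the series (8) is geometric, and its convergence to $\delta/r$ is easy to check numerically. The following Python sketch assumes illustrative values ($\delta = 2$, $r = 1/20$, truncation at $k = 300$ terms); the function name `fundamental_price` is hypothetical.

```python
from fractions import Fraction

def fundamental_price(delta, r, k):
    """Partial sum of (8) for constant dividends: sum_{i=1..k} delta/(1+r)^i."""
    return sum(delta / (1 + r) ** i for i in range(1, k + 1))

delta, r = Fraction(2), Fraction(1, 20)   # assumed: dividend 2, rate 5%
limit = delta / r                          # the constant solution delta/r

# The constant price delta/r solves the one-step relation (6) exactly
one_step = (limit + delta) / (1 + r)

# Closed form of the partial sum: (delta/r) * (1 - (1+r)^(-k))
k = 300
partial = fundamental_price(delta, r, k)
closed_form = limit * (1 - (1 + r) ** (-k))
gap = float(limit - partial)               # remaining tail of the series
```

The tail `gap` equals $(\delta/r)(1+r)^{-k}$ and decreases geometrically in $k$, which is why the assumption $r > 0$ matters for passing from (7) to (8).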


8. The class of martingales is fairly wide. For instance, it contains the ‘random
walk’ considered above. Further, the martingale property

$$E(X_n \mid \mathcal{F}_{n-1}) = X_{n-1}$$

shows that, as regards the predictions of the values of the increments
$\Delta X_n = X_n - X_{n-1}$, the best we can get out of the ‘data’ $\mathcal{F}_{n-1}$ is that the increment vanishes
on average (with respect to $\mathcal{F}_{n-1}$). This conforms with our innate perception that
the conditional gains $E(\Delta X_n \mid \mathcal{F}_{n-1})$ must vanish in a ‘fair’, ‘well-organized’ market,
which, in turn, can be interpreted as the impossibility of riskless profits. It is in that
connection that L. Bachelier wrote (in the English translation): “The mathematical
expectation of the speculator is zero”. (We recall that, in gambling, the system
when one doubles one’s stake after a loss and drops out after the first win is called
the martingale; the conditional gains for this strategy are $E(\Delta X_n \mid \mathcal{F}_{n-1}) = 0$;
see [439; Chapter VII, § 1] for detail.)

Finally, we point out that, as an empirical analysis of price evolution shows
(Chapter IV, § 3c), the autocorrelation of the variables

$$h_n = \ln \frac{S_n}{S_{n-1}}, \qquad n \ge 1,$$

is close to zero, which can be regarded as an argument (albeit an indirect one) in favor
of the martingale conjecture.

9. The conjecture of efficient markets gave an impetus to the development of new
financial instruments, suitable for ‘cautious’ investors adhering to the idea of diversification
(see § 2b).

Among these instruments we must first place a subvariety of ‘Mutual Funds’, 
the so-called ‘Index Funds’. 

The peculiarity of these funds is that they invest (their clients’) money in shares
of corporations included in one or another ‘Index’ of stock.

One of the first such funds was (and still is) Vanguard Index Trust—500 Portfolio 
(founded in 1976 by Vanguard Group; USA) that operates (buys and sells) shares of 
the firms included in Standard&Poor’s-500 Index, which comprises the stock of 500 
corporations (400 industrial companies, 20 transportation companies, 40 utilities, 
and 40 financial corporations). 

According to the conjecture of an efficient market, prices change (and therefore 
financial decisions also change—and rather promptly at that) when the informa- 
tion is updated. On the other hand, commonplace investors (either individuals or 
institutions) do not have sufficient information and usually cannot quickly respond 
to the changes of prices. Moreover, the overheads of a ‘lone trader’ can ‘eat up’ all 
his profits. 

For that reason investment in index funds can be attractive for those ‘underinformed’
investors who do not expect ‘prompt-and-big’ profits, but prefer (cautious)
well-diversified investment in long-term securities instead.

Other examples of similar funds (issued by Vanguard) are Vanguard Index 
Trust-—Extended Market Portfolio, Vanguard Index Trust—Small Capitalization 
Stock Portfolio, and Vanguard Bond Index Fund, based mostly on American secu- 
rities, and also International Equity Fund—European Portfolio, Pacific Portfolio, 
and others based on foreign securities. 


§ 2b. Investment Portfolio. Markowitz’s Diversification 


1. We already noted in §2a that Markowitz’s paper [332] (1952) was decisive for
the development of the modern theory and practice of financial management and
financial engineering. The most attractive point for investors in his theory was
the idea of the diversification of an investment portfolio, because it did not merely
demonstrate a theoretical possibility to reduce (unsystematic) investment risks, but
also gave recommendations how one could achieve that in practice.

To clarify the basics and the main ideas of this theory we consider the following 
single-step investment problem. 

Assume that the investor can allocate his starting capital $x$ at the instant $n = 0$
among the stocks $A_1, \ldots, A_N$ at the prices $S_0(A_1), \ldots, S_0(A_N)$, respectively.

Let $X_0(b) = b_1 S_0(A_1) + \cdots + b_N S_0(A_N)$, where $b_i \ge 0$, $i = 1, \ldots, N$. In other
words, let

$$b = (b_1, \ldots, b_N)$$

be the investment portfolio, where $b_i$ is the number of shares $A_i$ of value $S_0(A_i)$.
We assume the following law governing the evolution of the price of each share $A_i$:
its price $S_1(A_i)$ at the instant $n = 1$ must satisfy the difference equation

$$\Delta S_1(A_i) = \rho(A_i) S_0(A_i),$$

or, equivalently,

$$S_1(A_i) = (1 + \rho(A_i)) S_0(A_i),$$

where $\rho(A_i)$ is the random interest rate of $A_i$, $\rho(A_i) > -1$.
If the investor has selected the portfolio $b = (b_1, \ldots, b_N)$, then his initial capital
$X_0(b) = x$ becomes

$$X_1(b) = b_1 S_1(A_1) + \cdots + b_N S_1(A_N),$$

and he would like to make the last value ‘a bit larger’. However, his desire must be
weighed against the ‘risks’ involved.

To this end Markowitz considers the following two characteristics of the capital $X_1(b)$:

$$E X_1(b), \quad \text{its expectation,}$$

and

$$D X_1(b), \quad \text{its variance.}$$

Given these parameters, there are several ways to pose an optimization problem of
the best portfolio choice depending on the optimality criteria.
For example, we can ask which portfolio $b^*$ delivers the maximum value for some
performance $f = f(E X_1(b), D X_1(b))$ under the following ‘budget constraint’ on the
class of admissible portfolios:

$$B(x) = \{b = (b_1, \ldots, b_N) : b_i \ge 0,\ X_0(b) = x\}, \qquad x > 0.$$

There exists also a natural variational setting: find

$$\inf D X_1(b)$$

over the portfolios $b$ satisfying the conditions

$$b \in B(x), \qquad E X_1(b) = m,$$

where $m$ is a fixed constant.

Fig. 8 depicts a typical pattern of the set of points $\bigl(E X_1(b), \sqrt{D X_1(b)}\bigr)$ such
that the portfolio $b$ belongs to $B(x)$ and, maybe, satisfies also some additional
constraints.

FIGURE 8. Illustration to Markowitz’s mean-variance analysis (horizontal axis: $\sqrt{D X_1(b)}$; vertical axis: $E X_1(b)$)

It is clear from this picture that if we are interested in the maximum mean
value of the portfolio with minimum variance, then we must choose
portfolios such that the points $\bigl(E X_1(b), \sqrt{D X_1(b)}\bigr)$ lie on the thick piece of curve
with end-points $\alpha$ and $\beta$. (Markowitz says that these are efficient portfolios and
he terms the above kind of analysis of mean values and variances Mean-variance
analysis.)

2. We now claim that in the single-step optimization problem for an investment
portfolio, in place of the quantities $(S_1(A_1), \ldots, S_1(A_N))$ we can directly consider
the interest rates $(\rho(A_1), \ldots, \rho(A_N))$. This means the following.

Let $b \in B(x)$, i.e., let $x = b_1 S_0(A_1) + \cdots + b_N S_0(A_N)$. We introduce the
quantities $d = (d_1, \ldots, d_N)$ by the equalities

$$d_i = \frac{b_i S_0(A_i)}{x}.$$

Since $b \in B(x)$, it follows that $d_i \ge 0$ and $\sum_{i=1}^{N} d_i = 1$. We can represent the portfolio
value $X_1(b)$ as the product

$$X_1(b) = (1 + R(b)) X_0(b),$$

and let

$$\rho(d) = d_1 \rho(A_1) + \cdots + d_N \rho(A_N).$$

Clearly,

$$R(b) = \frac{X_1(b)}{X_0(b)} - 1 = \frac{X_1(b)}{x} - 1 = \frac{\sum_{i} b_i S_1(A_i)}{x} - 1$$

$$= \sum_{i} d_i \frac{S_1(A_i)}{S_0(A_i)} - 1 = \sum_{i} d_i \left(\frac{S_1(A_i)}{S_0(A_i)} - 1\right) = \sum_{i} d_i \rho(A_i) = \rho(d).$$

Thus,

$$R(b) = \rho(d),$$

therefore if $d = (d_1, \ldots, d_N)$ and $b = (b_1, \ldots, b_N)$ satisfy the relations $d_i = \dfrac{b_i S_0(A_i)}{x}$,
$i = 1, \ldots, N$, then

$$X_1(b) = x (1 + \rho(d))$$

for $b \in B(x)$. Hence, to solve an optimization problem for $X_1(b)$ we can consider
the corresponding problem for $\rho(d)$.
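The passage from share counts $b$ to capital weights $d$ can be illustrated on numbers. In the Python sketch below the prices, the realised interest rates, and the portfolio are assumed toy values (three hypothetical stocks); exact fractions are used so that the identity $X_1(b) = x(1 + \rho(d))$ can be checked without rounding.

```python
from fractions import Fraction as F

s0 = [F(20), F(50), F(10)]            # S_0(A_i), assumed initial prices
rho = [F(1, 10), F(-1, 25), F(1, 4)]  # one realisation of rho(A_i), assumed
b = [F(5), F(2), F(30)]               # numbers of shares b_i, assumed

x = sum(bi * si for bi, si in zip(b, s0))            # X_0(b), the initial capital
s1 = [(1 + ri) * si for ri, si in zip(rho, s0)]      # S_1(A_i) = (1 + rho(A_i)) S_0(A_i)
x1 = sum(bi * si for bi, si in zip(b, s1))           # X_1(b)

d = [bi * si / x for bi, si in zip(b, s0)]           # weights d_i = b_i S_0(A_i) / x
rho_d = sum(di * ri for di, ri in zip(d, rho))       # rho(d)
```

Here the weights sum to one and `x1` coincides with `x * (1 + rho_d)`, so the return of the portfolio depends on $b$ only through the weights $d$.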


3. We now discuss the issue of diversification as a means of reduction of the (unsystematic)
risks to an arbitrarily low level, measured in terms of the variance or
the standard error of $X_1(b)$.

To this end we consider first a pair of random variables $\xi_1$ and $\xi_2$ with finite
second moments. If $c_1$ and $c_2$ are constants and $\sigma_i = \sqrt{D\xi_i}$, $i = 1, 2$, then

$$D(c_1 \xi_1 + c_2 \xi_2) = (c_1 \sigma_1 - c_2 \sigma_2)^2 + 2 c_1 c_2 \sigma_1 \sigma_2 (1 + \sigma_{12}),$$

where $\sigma_{12} = \dfrac{\operatorname{Cov}(\xi_1, \xi_2)}{\sigma_1 \sigma_2}$ and $\operatorname{Cov}(\xi_1, \xi_2) = E\xi_1\xi_2 - E\xi_1 E\xi_2$. Hence, if $c_1 \sigma_1 = c_2 \sigma_2$
and $\sigma_{12} = -1$, then, clearly, $D(c_1 \xi_1 + c_2 \xi_2) = 0$.

Thus, if the variables $\xi_1$ and $\xi_2$ have negative correlation with coefficient
$\sigma_{12} = -1$, then we can choose $c_1$ and $c_2$ such that $c_1 \sigma_1 = c_2 \sigma_2$ so as to obtain a
combination $c_1 \xi_1 + c_2 \xi_2$ of variance zero. Of course, we can attain in this way a
fairly small mean value $E(c_1 \xi_1 + c_2 \xi_2)$. (The case of $c_1 = c_2 = 0$ is of no interest
for optimization since $b \in B(x)$.)
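The effect of the correlation coefficient on the variance of a two-asset combination can be tabulated directly. The following Python sketch assumes illustrative standard deviations (0.2 and 0.5) and chooses the coefficients so that $c_1\sigma_1 = c_2\sigma_2$; the function names are hypothetical.

```python
import math

def combo_variance(c1, c2, sigma1, sigma2, corr):
    """D(c1*xi1 + c2*xi2) from standard deviations and the correlation coefficient."""
    return ((c1 * sigma1) ** 2 + (c2 * sigma2) ** 2
            + 2 * c1 * c2 * sigma1 * sigma2 * corr)

def combo_variance_split(c1, c2, sigma1, sigma2, corr):
    """The same variance written as in the identity displayed above."""
    return ((c1 * sigma1 - c2 * sigma2) ** 2
            + 2 * c1 * c2 * sigma1 * sigma2 * (1 + corr))

sigma1, sigma2 = 0.2, 0.5          # assumed standard deviations
c1, c2 = sigma2, sigma1            # chosen so that c1*sigma1 == c2*sigma2

# Variance of the combination for several correlation coefficients
variances = {corr: combo_variance(c1, c2, sigma1, sigma2, corr)
             for corr in (-1.0, -0.5, 0.0, 0.5)}
```

With these numbers the variance grows linearly in the correlation coefficient and vanishes (up to rounding) at $\sigma_{12} = -1$.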

It becomes clear from these elementary arguments that, given our constraints
on $(c_1, c_2)$ and the class of the variables $(\xi_1, \xi_2)$, in the solution of the problem of
making $E(c_1 \xi_1 + c_2 \xi_2)$ possibly larger and $D(c_1 \xi_1 + c_2 \xi_2)$ possibly smaller we must
choose pairs $(\xi_1, \xi_2)$ with correlation coefficient as close to $-1$ as possible.

The above phenomenon of negative correlation, which is also called the Markowitz
phenomenon, is one of the basic ideas of investment diversification: in building an
investment portfolio one must invest in possibly many negatively correlated securities.

Another concept at the heart of diversification is based on the following idea.

Let $\xi_1, \xi_2, \ldots, \xi_N$ be a sequence of uncorrelated random variables with variances
$D\xi_i \le C$, $i = 1, \ldots, N$, where $C$ is a constant. Then

$$D(d_1 \xi_1 + \cdots + d_N \xi_N) = \sum_{i=1}^{N} d_i^2 D\xi_i \le C \sum_{i=1}^{N} d_i^2.$$

Hence, setting, e.g., $d_i = 1/N$ we see that

$$D(d_1 \xi_1 + \cdots + d_N \xi_N) \le \frac{C}{N} \to 0 \quad \text{as } N \to \infty.$$

This phenomenon of the absence of correlation indicates that, in the case of
investment in uncorrelated securities, their number $N$ must be possibly larger to
reduce the risk (i.e., the variance $D(d_1 \xi_1 + \cdots + d_N \xi_N)$).

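We return now to the question of the variance $D\rho(d)$ of the variable

$$\rho(d) = d_1 \rho(A_1) + \cdots + d_N \rho(A_N).$$

We have

$$D\rho(d) = \sum_{i=1}^{N} d_i^2 D\rho(A_i) + \sum_{\substack{i,j=1 \\ i \ne j}}^{N} d_i d_j \operatorname{Cov}(\rho(A_i), \rho(A_j)).$$

We now set $d_i = 1/N$. Then

$$\sum_{i=1}^{N} d_i^2 D\rho(A_i) = \left(\frac{1}{N}\right)^2 N \cdot \frac{1}{N} \sum_{i=1}^{N} D\rho(A_i) = \frac{1}{N}\,\bar\sigma_N^2,$$

where $\bar\sigma_N^2 = \dfrac{1}{N} \sum_{i=1}^{N} D\rho(A_i)$ is the mean variance. Further,

$$\sum_{\substack{i,j=1 \\ i \ne j}}^{N} d_i d_j \operatorname{Cov}(\rho(A_i), \rho(A_j)) = \left(\frac{1}{N}\right)^2 (N^2 - N) \cdot \overline{\operatorname{Cov}}_N,$$

where

$$\overline{\operatorname{Cov}}_N = \frac{1}{N^2 - N} \sum_{\substack{i,j=1 \\ i \ne j}}^{N} \operatorname{Cov}(\rho(A_i), \rho(A_j))$$

is the mean covariance. Thus,

$$D\rho(d) = \frac{1}{N}\,\bar\sigma_N^2 + \left(1 - \frac{1}{N}\right) \overline{\operatorname{Cov}}_N, \tag{1}$$

and, clearly, if $\bar\sigma_N^2 \le C$ and $\overline{\operatorname{Cov}}_N \to \overline{\operatorname{Cov}}$ as $N \to \infty$, then

$$D\rho(d) \to \overline{\operatorname{Cov}} \quad \text{as } N \to \infty. \tag{2}$$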
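Formula (1) and the limit (2) can be illustrated numerically for equal weights $d_i = 1/N$. The Python sketch below assumes a hypothetical covariance structure (variances between 0.04 and 0.08 and one common covariance 0.02 for all pairs); the function names and all numbers are illustrative.

```python
def equal_weight_variance(variances, cov):
    """D rho(d) for d_i = 1/N, computed directly from the covariance structure.

    `variances[i]` is D rho(A_i); `cov(i, j)` returns Cov(rho(A_i), rho(A_j)), i != j.
    """
    n = len(variances)
    total = sum(variances)
    total += sum(cov(i, j) for i in range(n) for j in range(n) if i != j)
    return total / n ** 2

def formula_1(variances, cov):
    """Right-hand side of (1): mean variance / N + (1 - 1/N) * mean covariance."""
    n = len(variances)
    mean_var = sum(variances) / n
    mean_cov = sum(cov(i, j) for i in range(n) for j in range(n) if i != j) / (n * n - n)
    return mean_var / n + (1 - 1 / n) * mean_cov

COMMON_COV = 0.02                       # assumed common covariance of all pairs

def risk(n):
    """Equal-weight portfolio variance for n assets with assumed variances."""
    variances = [0.04 + 0.01 * (i % 5) for i in range(n)]
    return equal_weight_variance(variances, lambda i, j: COMMON_COV)

risks = {n: risk(n) for n in (2, 10, 100, 400)}
```

In this sketch `risks` decreases with $N$ but stays above `COMMON_COV`: the part of the variance that comes from the mean covariance is not removed by adding more assets.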


We see from formula (2) that if $\overline{\operatorname{Cov}}$ is zero, then using diversification with $N$
sufficiently large we can reduce the investment risk $D\rho(d)$ to an arbitrarily low level.
Unfortunately, the prices on, say, a stock market usually have positive correlation
(they change in a fairly coordinated manner, in the same direction), and therefore
$\overline{\operatorname{Cov}}_N$ does not approach zero as $N \to \infty$. The limit value $\overline{\operatorname{Cov}}$ is just the systematic
(or market) risk inherent to the market in question, which cannot be reduced
by diversification, while the first term in (1) describes the unsystematic risks, which
can be reduced, as shown, by a selection of a large number of stocks. (For a more
detailed exposition of Mean-variance analysis, see [331]-[333] and [268].)

§2c. CAPM: Capital Asset Pricing Model


1. To evaluate the optimal portfolio using mean-variance analysis one must know
$E\rho(A_i)$ and $\operatorname{Cov}(\rho(A_i), \rho(A_j))$, and the theory does not explain how one can find
these values. (In practice, one approximates them on the basis of conventional
statistical means and covariances of the past data.)

CAPM-theory (W. F. Sharpe [433] and J. Lintner [301]) and APT-theory that
we consider below do not merely answer the questions on the values of $E\rho(A_i)$ and
$\operatorname{Cov}(\rho(A_i), \rho(A_j))$, but also exhibit the dependence of the (random) interest rates
$\rho(A_i)$ of particular stocks $A_i$ on the interest rate $\rho$ of the ‘large’ market where the $A_i$ are traded.
Besides the covariance $\operatorname{Cov}(\rho(A_i), \rho(A_j))$, which plays a key role in Markowitz’s
mean-variance analysis, CAPM distinguishes another important parameter, the covariance
$\operatorname{Cov}(\rho(A_i), \rho)$ between the interest rate of a stock $A_i$ and the market
interest rate.

CAPM bases its conclusions on the concept of equilibrium market, which implies,
in particular, that there are no overheads and all the participants (investors) are
‘uniform’ in that they have the same capability of prediction of the future price
movement on the basis of information available to everybody, the same time horizon,
and all their decisions are based on the mean values and the covariances of the prices.
It is also assumed that all the assets under consideration are ‘infinitely divisible’
and there exists a risk-free security (bank account, Treasury bills, ...) with interest
rate $r$.

The existence of a risk-free security is of crucial importance because the interest
$r$ enters all formulas of CAPM-theory as a ‘basis variable’, a reference point.

It should be pointed out in this connection that long-term observations of
the mean values $E\rho(A_i)$ of interest rates $\rho(A_i)$ of risky securities $A_i$ show that
$E\rho(A_i) > r$. Table 1, made up of the 1926-1985 yearly averages, enables one to
compare the nominal and the real (with inflation taken into account) mean values
of interest rates.


TABLE 1

Security            Nominal interest rate    Interest rate in real terms
                                             (taking account of inflation)

Common stock        12%                      8.8%
Corporate bonds     5.1%                     2.1%
Government bonds    4.4%                     1.4%
Treasury bills      3.5%                     0.4%


2. We now explain the fundamentals of CAPM-theory using an example of a market
operating in one step.

Let $S_1 = S_0(1 + \rho)$ be the value of some (random) price on some ‘large’ market
(for example, we can consider the values of the S&P500 Index) at time $n = 1$. Let
$S_1(A) = S_0(A)(1 + \rho(A))$ be the price of the asset $A$ (some stock from S&P500
Index) with interest rate $\rho(A)$ at time $n = 1$.

The evolution of the price of the risk-free asset can be described by the formula

$$B_1 = B_0(1 + r).$$

On the basis of the built-in concept of equilibrium market, CAPM-theory establishes
(see, e.g., [268] or [433]) that for each asset $A$ there exists a quantity $\beta(A)$ (the beta
of this asset⁵) such that

$$E[\rho(A) - r] = \beta(A)\, E[\rho - r], \tag{1}$$

where

$$\beta(A) = \frac{\operatorname{Cov}(\rho(A), \rho)}{D\rho}. \tag{2}$$

In other words, the mean value of the ‘premium’ $\rho(A) - r$ (for using the risky
asset $A$ in place of the risk-free one) is in proportion to the mean value of the
premium $\rho - r$ (of investments in some global characteristic of the market, e.g., the
S&P500 Index).

Formula (2) shows that the value $\beta(A)$ of the ‘beta’ is defined by the correlation
properties of the interest rates $\rho$ and $\rho(A)$ or, equivalently, by the covariance
properties of the corresponding prices $S_1$ and $S_1(A)$.

We now rewrite (1) as follows:

$$E\rho(A) = r + \beta(A)\, E(\rho - r); \tag{3}$$

let $\rho_\beta$ be the value of the interest rate $\rho(A)$ of the asset $A$ with $\beta(A) = \beta$.
If $\beta = 0$, then

$$E\rho_0 = r,$$

while if $\beta = 1$, then

$$E\rho_1 = E\rho.$$

Bearing this in mind we see that (3) is the equation of the CAPM line

$$E\rho_\beta = r + \beta\, E(\rho - r),$$

plotted in Fig. 9 and depicting the mean returns $E\rho_\beta$ from assets in their dependence
on $\beta$, the interest rate $r$, and the mean market interest rate $E\rho$.

⁵ If each asset $A$ has its ‘beta’ $\beta(A)$, then one would expect it to have also ‘alpha’ $\alpha(A)$.
This is the name several authors (see, for example, [267]) use for the mean value $E\rho(A)$.

FIGURE 9. CAPM line


The value of $\beta = \beta(A)$ plays an important role in the selection of investment
portfolios; this is ‘a measure of sensitivity’, ‘a measure of the response’ of the asset
to changes on the market. For definiteness we assume that we measure the state of
the market in the values of the S&P500 Index and that corporation $A$, whose shares we
are now discussing, is one of the 500 firms in this index. If the index changes by
1% and the ‘beta’ of the stock $A$ is 1.5, then the change in the price of $A$ is (on the
average) 1.5%.

In practice, one can evaluate the ‘beta’ of a certain asset using statistical data
and conventional linear regression methods (since (3) is a linear relation).


3. For the asset $A$ we now consider the variable

$$\eta(A) = \rho(A) - E\rho(A) - \beta(A)(\rho - E\rho). \tag{4}$$

Clearly, $E\eta(A) = 0$ and

$$E\bigl(\eta(A)(\rho - E\rho)\bigr) = 0,$$

i.e., the variables $\eta(A)$ and $\rho - E\rho$ with means zero are uncorrelated. Hence

$$\rho(A) - E\rho(A) = \beta(A)(\rho - E\rho) + \eta(A); \tag{5}$$

this brings us by (3) to the following relation between the premiums $\rho(A) - r$ and
$\rho - r$:

$$\rho(A) - r = \beta(A)(\rho - r) + \eta(A), \tag{6}$$

which shows that the premium $(\rho(A) - r)$ on the asset $A$ is the sum of ‘beta’ $(\beta(A))$
times the market premium $(\rho - r)$ and the variable $\eta(A)$ uncorrelated with $\rho - E\rho$.
By (5),

$$D\rho(A) = \beta^2(A)\, D\rho + D\eta(A), \tag{7}$$

which means that the risk $(D\rho(A))$ of an investment in $A$ is a combination of two:
the systematic risk $(\beta^2(A)\, D\rho)$,
inherent to the market and corresponding to the given ‘beta’, and the
unsystematic risk $(D\eta(A))$,
which is the property of the asset $A$ itself.
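The definition of ‘beta’ as a ratio of a covariance to a variance and the decomposition of the risk into a systematic and an unsystematic part can be checked on a small data set. The Python sketch below uses assumed numbers throughout (five hypothetical market returns, a noise sequence constructed to have zero mean and zero covariance with the market, $r = 5\%$, $\beta = 3/2$); the helper names `mean` and `cov` are illustrative, and empirical moments with the $1/n$ convention stand in for $E$, $D$, and $\operatorname{Cov}$.

```python
from fractions import Fraction as F

def mean(xs):
    """Arithmetic mean of a list of exact fractions."""
    return sum(xs) / len(xs)

def cov(xs, ys):
    """Covariance with the 1/n convention (population form)."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)

r, beta_true = F(5), F(3, 2)                   # assumed risk-free rate (%) and beta
market = [F(v) for v in (10, 11, 12, 13, 14)]  # assumed market returns rho (%)
noise = [F(v) for v in (1, -2, 2, -2, 1)]      # eta: mean zero, uncorrelated with rho

# Asset returns generated from the relation rho(A) - r = beta (rho - r) + eta
asset = [r + beta_true * (m - r) + e for m, e in zip(market, noise)]

beta_hat = cov(asset, market) / cov(market, market)    # beta = Cov(rho(A), rho) / D rho
systematic = beta_hat ** 2 * cov(market, market)       # beta^2(A) * D rho
unsystematic = cov(noise, noise)                       # D eta(A)
total_risk = cov(asset, asset)                         # D rho(A)
```

In this sketch `beta_hat` recovers the assumed value 3/2 exactly and `total_risk` equals `systematic + unsystematic`; with real data the noise is only approximately uncorrelated with the market, so both statements hold approximately.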

As in the previous section, we claim that here, in the framework of CAPM,
unsystematic risks can also be reduced by diversification. Namely, assume that
there are $N$ assets $A_1, \ldots, A_N$ on the ‘large’ market such that the corresponding
variables $\eta(A_1), \ldots, \eta(A_N)$ are uncorrelated: $\operatorname{Cov}(\eta(A_i), \eta(A_j)) = 0$, $i \ne j$.

Let $d = (d_1, \ldots, d_N)$ be an investment portfolio with $d_i \ge 0$, $\sum_{i=1}^{N} d_i = 1$, and

$$\rho(d) = d_1 \rho(A_1) + \cdots + d_N \rho(A_N).$$

Since

$$\rho(A_i) - r = \beta(A_i)\,[\rho - r] + \eta(A_i),$$

it follows that

$$\rho(d) - r = \sum_{i=1}^{N} d_i \beta(A_i)\,[\rho - r] + \sum_{i=1}^{N} d_i \eta(A_i).$$

Hence setting

$$\beta(d) = \sum_{i=1}^{N} d_i \beta(A_i), \qquad \eta(d) = \sum_{i=1}^{N} d_i \eta(A_i),$$

we see (cf. (6)) that

$$\rho(d) - r = \beta(d)(\rho - r) + \eta(d).$$


Consequently, in a similar way to the previous section,

$$D\rho(d) = \beta^2(d)\, D\rho + D\eta(d),$$

where $D\eta(d) = \sum_{i=1}^{N} d_i^2 D\eta(A_i) \le \dfrac{C}{N} \to 0$ as $N \to \infty$ if $D\eta(A_i) \le C$ and $d_i = \dfrac{1}{N}$.
Thus, in the framework of CAPM, diversification enables one to reduce non-systematic
risks to arbitrarily small values by a selection of sufficiently many ($N$)
assets.

We can supplement the above table of the mean values of interest rates by the
table of their standard errors:

TABLE 2

Portfolio           Standard deviation

Common stocks       21.2%
Corporate bonds     8.3%
Government bonds    8.2%
Treasury bills      3.4%

Taking the assumption that $\rho \sim \mathcal{N}(\mu, \sigma^2)$, i.e., we have a normal distribution
with expectation $\mu$ and deviation $\sigma$, we obtain that the values of $\rho$ lie with probability
0.9 in the interval $[\mu - 1.65\sigma, \mu + 1.65\sigma]$ and, with probability 0.67, in the
interval $[\mu - \sigma, \mu + \sigma]$.

Thus, combining these two tables we obtain the following confidence intervals
(on the percentage basis):

TABLE 3

Portfolio           Probability 0.67     Probability 0.9

Common stocks       [-9.2, 33.2]         [-22.98, 46.98]
Corporate bonds     [-3.2, 13.4]         [-8.59, 18.79]
Government bonds    [-3.8, 12.6]         [-9.13, 17.93]
Treasury bills      [0.1, 6.9]           [-2.11, 9.11]

(Recall that we presented the corresponding mean values in Table 1.)
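The intervals of this kind follow mechanically from the nominal means of Table 1 and the standard deviations of Table 2. The Python sketch below recomputes them under the normality assumption; the dictionary `TABLES` and the function `interval` are illustrative names, and the mean for Government bonds is taken as 4.4%, the value consistent with the interval [-9.13, 17.93].

```python
# Mean values (Table 1, nominal) and standard deviations (Table 2), in percent
TABLES = {
    "Common stocks":    (12.0, 21.2),
    "Corporate bonds":  (5.1, 8.3),
    "Government bonds": (4.4, 8.2),
    "Treasury bills":   (3.5, 3.4),
}

def interval(mu, sigma, z):
    """Interval [mu - z*sigma, mu + z*sigma] for a normal law N(mu, sigma^2)."""
    return (mu - z * sigma, mu + z * sigma)

# z = 1 for probability 0.67 and z = 1.65 for probability 0.9, as in the text
intervals = {
    name: {"p=0.67": interval(mu, sigma, 1.0), "p=0.9": interval(mu, sigma, 1.65)}
    for name, (mu, sigma) in TABLES.items()
}

for name, row in intervals.items():
    lo67, hi67 = row["p=0.67"]
    lo90, hi90 = row["p=0.9"]
    print(f"{name:<17} [{lo67:6.2f}, {hi67:6.2f}]   [{lo90:6.2f}, {hi90:6.2f}]")
```

The width of each interval is driven by the standard deviation alone, which is why common stocks, with the largest mean, also have by far the widest range of plausible yearly returns.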

We also present, for a few corporations $A$, the table of approximate values of
their ‘betas’, the mean expected returns $E\rho(A) = r + \beta(A)\, E(\rho - r)$, and the standard
errors $\sqrt{D\rho(A)}$ (on the percentage basis):

TABLE 4

Stocks $A$          $\beta(A)$    $E\rho(A)$    $\sqrt{D\rho(A)}$

AT&T                0.81          12.4          23.1
Exxon               0.71          13.2          17.7
Compaq Computer     1.73          20.1          57.3

(The data are borrowed from [344] and [55]; the standard errors are calculated
on the basis of the data for the period 1981-86; the estimates of $\beta(A)$ and $E\rho(A)$
are presented as of the beginning of 1987 with $r = 5.6\%$ and $E(\rho - r)$ taken to
be 8.4%.)

§2d. APT: Arbitrage Pricing Theory


1. In the foreground of the CAPM-theory there is the question on the relation
between the return from a particular asset on an ‘equilibrium’ market and the
return on the ‘large’ market where this asset is traded (see (1) in § 2c) and of
the concomitant risks. Here (see formula (6) in the preceding section) the return
(interest) $\rho(A)$ on an asset $A$ is defined by the equality

$$\rho(A) = r + \beta(A)(\rho - r) + \eta(A). \tag{1}$$


A more recent theory of ‘risks and returns’, APT (Arbitrage Pricing Theory;
S. A. Ross, R. Roll and S. A. Ross [410]), is based on a multiple-factor model. It
takes for granted that the value of $\rho(A)$ corresponding to $A$ depends on several
random factors $f_1, \ldots, f_q$ (which can range in various domains and describe, say,
the oil prices, the interest rates, etc.) and a ‘noise’ term $\zeta(A)$ so that

$$\rho(A) = a_0(A) + a_1(A) f_1 + \cdots + a_q(A) f_q + \zeta(A). \tag{2}$$

Here $E f_i = 0$, $D f_i = 1$, and $\operatorname{Cov}(f_i, f_j) = 0$ for $i \ne j$, while for the ‘noise’ term
$\zeta(A)$ we have $E\zeta(A) = 0$, and it does not correlate with $f_1, \ldots, f_q$ or with the ‘noise’
terms corresponding to other assets.

Comparing (1) and (2) we see that (1) is a particular case of the single-factor
model with factor $f_1 = \rho$. In this sense APT is a generalization of CAPM, although
the methods of the latter theory remain one of the favorite tools of practitioners in
security pricing, for they are clear, simple, and there exists already a well-established
tradition of operating ‘betas’, the measures of assets’ responses to market changes.

One of the central results of CAPM-theory, based on the conjecture of an equilibrium
market, is formula (1) in the preceding section, which describes the average
premium $E(\rho(A) - r)$ as a function of the average premium $E(\rho - r)$.

Likewise, the central result of APT, which is based on the conjecture of the
absence of asymptotic arbitrage in the market, is the (asymptotic) formula for the
mean value $E\rho(A)$ that we present below and that holds under the assumption that
the behavior of $\rho(A)$ corresponding to an asset $A$ in question can be described by
the multi-factor model (2).

We recall that $\rho(A)$ is the (random) interest rate of the asset $A$ in the single-step
model $S_1(A) = S_0(A)(1 + \rho(A))$ (considered above).


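2. Assume that we have an ‘$N$-market’ of $N$ assets $A_1, \ldots, A_N$ with $q$ active factors
$f_1, \ldots, f_q$ such that

$$\rho(A_i) = a_0(A_i) + a_1(A_i) f_1 + \cdots + a_q(A_i) f_q + \zeta(A_i),$$

where $E f_k = 0$, $E\zeta(A_i) = 0$, the covariance $\operatorname{Cov}(f_k, f_l)$ is zero for $k \ne l$,
$D f_k = 1$, $\operatorname{Cov}(f_k, \zeta(A_i)) = 0$, and $\operatorname{Cov}(\zeta(A_i), \zeta(A_j)) = \sigma_{ij}$ for $k, l = 1, \ldots, q$ and
$i, j = 1, \ldots, N$.

We now consider a portfolio $d = (d_1, \ldots, d_N)$. The corresponding return is

$$\rho(d) = d_1 \rho(A_1) + \cdots + d_N \rho(A_N)$$

$$= \sum_{i=1}^{N} d_i a_{i0} + \Bigl(\sum_{i=1}^{N} d_i a_{i1}\Bigr) f_1 + \cdots + \Bigl(\sum_{i=1}^{N} d_i a_{iq}\Bigr) f_q + \sum_{i=1}^{N} d_i \zeta(A_i), \tag{3}$$

where $a_{ik} = a_k(A_i)$.
As shown below, under certain assumptions on the coefficients $a_{ik}$ in (2) we can
find a nontrivial portfolio $d = (d_1, \ldots, d_N)$ with $d_i = d_i(N)$ such that

$$d_1 + \cdots + d_N = 0, \tag{4}$$

$$\sum_{i=1}^{N} d_i a_{ik} = 0, \qquad k = 1, \ldots, q, \tag{5}$$

$$\sum_{i=1}^{N} d_i a_{i0} = \sum_{i=1}^{N} d_i^2. \tag{6}$$

Then, for the portfolio $\theta d = (\theta d_1, \ldots, \theta d_N)$ (here $\theta$ is a constant) we have

$$\rho(\theta d) = \theta \rho(d),$$

and by (2)-(6),

$$\rho(\theta d) = \theta \sum_{i=1}^{N} d_i^2 + \theta \sum_{i=1}^{N} d_i \zeta(A_i). \tag{7}$$

Hence

$$\mu(\theta d) \equiv E\rho(\theta d) = \theta \sum_{i=1}^{N} d_i^2,$$

$$\sigma^2(\theta d) \equiv D\rho(\theta d) = \theta^2 \sum_{i,j=1}^{N} d_i d_j \sigma_{ij}.$$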
We now set

$$\theta = \Bigl(\sum_{i=1}^{N} d_i^2\Bigr)^{-2/3} \quad \Bigl(= \|d\|^{-4/3}, \text{ where } \|d\| = \Bigl(\sum_{i=1}^{N} d_i^2\Bigr)^{1/2}\Bigr).$$

Then

$$\mu(\theta d) = \Bigl(\sum_{i=1}^{N} d_i^2\Bigr)^{1/3}, \tag{9}$$

$$\sigma^2(\theta d) = \Bigl(\sum_{i=1}^{N} d_i^2\Bigr)^{-4/3} \sum_{i,j=1}^{N} d_i d_j \sigma_{ij}. \tag{10}$$

Assuming (for the simplicity of analysis; see, e.g., [240] or [268] for the general case)
that $\sigma_{ij} = 0$ for $i \ne j$ and $\sigma_{ii} = 1$, we obtain

$$\sigma^2(\theta d) = \Bigl(\sum_{i=1}^{N} d_i^2\Bigr)^{-1/3}. \tag{11}$$

Formulas (9) and (11) play a key role in the asymptotic analysis below. They show
that if $\sum_{i=1}^{N} d_i^2 \to \infty$ as $N \to \infty$, then $\mu(\theta d) \to \infty$ and $\sigma^2(\theta d) \to 0$. However, if we
set $S_0(A_1) = \cdots = S_0(A_N) = 1$, then by the condition $d_1 + \cdots + d_N = 0$ we obtain
that the initial value of the portfolio $\theta d$ is

$$X_0(\theta d) = \theta (d_1 + \cdots + d_N) = 0,$$

while its value at time $n = 1$ is

$$X_1(\theta d) = \theta \bigl(d_1 S_1(A_1) + \cdots + d_N S_1(A_N)\bigr) = \theta \rho(d) = \rho(\theta d).$$

Further, if $E X_1(\theta d) = \mu(\theta d) \to \infty$, while $D X_1(\theta d) \to 0$ as $N \to \infty$, then for
sufficiently large $N$ we have $X_1(\theta d) \ge 0$ with large probability and $X_1(\theta d) > 0$ with
non-zero probability. In other words, starting with initial capital zero and operating
on an ‘$N$-market’ with assets $A_1, \ldots, A_N$, $N \gg 1$, one can build a portfolio bringing
one (asymptotically) nontrivial profit. This is just what is interpreted in APT as
the existence of asymptotic arbitrage.

Thus, assuming that the ‘$N$-markets’ are asymptotically (as $N \to \infty$) arbitrage-free
we arrive at the conclusion that one must rule out the possibility of $\sum_{i=1}^{N} d_i^2 \to \infty$,
which would leave space for arbitrage. Of course, this imposes certain constraints
on the coefficients of the multi-factor model (2) because the selection of the
portfolio $d = (d_1, \ldots, d_N)$, $d_i = d_i(N)$, $i \le N$, with properties (4)-(6) described
below proceeds with an eye to these coefficients.

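We now consider the matrix

$$\mathcal{A} = \begin{pmatrix} 1 & a_{11} & a_{12} & \cdots & a_{1q} \\ 1 & a_{21} & a_{22} & \cdots & a_{2q} \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & a_{N1} & a_{N2} & \cdots & a_{Nq} \end{pmatrix}. \tag{12}$$

On its basis, we construct another matrix,

$$\mathcal{B} = \mathcal{A}(\mathcal{A}^*\mathcal{A})^{-1}\mathcal{A}^*, \tag{13}$$

which we assume to be well-defined (here ‘$*$’ signifies transposition).
Let

$$d = (I - \mathcal{B})\, a_0, \qquad e = \mathcal{B} a_0, \tag{14}$$

where $I$ is the identity matrix and $a_0$ is the column formed by $a_{10}, \ldots, a_{N0}$. Then
we have the orthogonal decomposition

$$a_0 = d + e \tag{15}$$

and

$$d^* \mathbf{1} = 0, \qquad d^* a_k = 0, \tag{16}$$

where $a_k$ is the column formed by $a_{1k}, \ldots, a_{Nk}$ and $\mathbf{1}$ is the column of ones.
Formulas (16) are just the above-discussed relations (4) and (5).
By (14) and (15) we also obtain

$$d^* a_0 = d^* d + d^* e = d^* d,$$

which is the required formula (6).
We note now that, by (14), the column $e$ can be represented as follows:

$$e = \lambda_0 \mathbf{1} + \lambda_1 a_1 + \cdots + \lambda_q a_q,$$

where the numbers $\lambda_0, \ldots, \lambda_q$ satisfy the relation

$$(\lambda_0, \ldots, \lambda_q)^* = (\mathcal{A}^*\mathcal{A})^{-1}\mathcal{A}^* a_0.$$

Hence

$$d = a_0 - \lambda_0 \mathbf{1} - \sum_{k=1}^{q} \lambda_k a_k$$

and

$$\sum_{i=1}^{N} d_i^2 = \sum_{i=1}^{N} \Bigl(a_{i0} - \lambda_0 - \sum_{k=1}^{q} \lambda_k a_{ik}\Bigr)^2. \tag{17}$$

Of course, for an ‘$N$-market’ all the coefficients $a_{i0}, \ldots, a_{ik}$ and $\lambda_0, \ldots, \lambda_q$ on
the right-hand side of this formula depend on $N$.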
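The construction of the portfolio $d$ as the residual of a least-squares projection can be traced on a toy ‘$N$-market’. The Python sketch below assumes $N = 4$ assets and a single factor ($q = 1$), so that the coefficients $(\lambda_0, \lambda_1)$ reduce to the usual simple-regression formulas; the loadings `a1` and the mean returns `a0` are hypothetical numbers, and unit uncorrelated noise is assumed for the last step.

```python
from fractions import Fraction as F

a1 = [F(1), F(2), F(3), F(4)]    # factor loadings a_{i1} (assumed)
a0 = [F(2), F(3), F(7), F(8)]    # mean returns a_{i0} (assumed)
n = len(a0)

# Least-squares coefficients (lambda_0, lambda_1) of a_0 on the columns 1 and a_1,
# written out via the simple-regression formulas valid for q = 1.
m1, m0 = sum(a1) / n, sum(a0) / n
lam1 = (sum((x - m1) * (y - m0) for x, y in zip(a1, a0))
        / sum((x - m1) ** 2 for x in a1))
lam0 = m0 - lam1 * m1

# d = a_0 - lambda_0 * 1 - lambda_1 * a_1, the residual of the projection
d = [y - lam0 - lam1 * x for x, y in zip(a1, a0)]

sum_d = sum(d)                                        # relation (4)
d_dot_a1 = sum(di * x for di, x in zip(d, a1))        # relation (5)
d_dot_a0 = sum(di * y for di, y in zip(d, a0))        # left-hand side of (6)
norm_sq = sum(di ** 2 for di in d)                    # right-hand side of (6)

# Scaling by theta = (sum d_i^2)^(-2/3), assuming sigma_ij = 0 (i != j), sigma_ii = 1
theta = float(norm_sq) ** (-2 / 3)
mu = theta * float(norm_sq)               # mean return, equals (sum d_i^2)^(1/3)
sigma_sq = theta ** 2 * float(norm_sq)    # variance, equals (sum d_i^2)^(-1/3)
```

In this sketch `sum_d` and `d_dot_a1` vanish and `d_dot_a0` equals `norm_sq`, as relations (4)-(6) require. With only four assets nothing can be said about asymptotics; the point is merely that `mu` grows and `sigma_sq` shrinks as `norm_sq` grows, which is the mechanism behind asymptotic arbitrage.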
Our assumption of the absence of asymptotic arbitrage rules out the possibility 
of 


N 
lim 2 d?(N) =% 
2 


(here we write d;(N) to emphasize the dependence of the number of shares of one 
or another asset, on the total number of assets in the market), therefore by (17) the 
‘N-market’ coefficients must ensure the inequality 


N 


q 2 
lim > joo) — o(N) - SO MCN Jai (N) < 00. (18) 
i=1 k=1 
This relation (deduced from the conjecture of the absence of asymptotic arbitrage)
is interpreted in the framework of APT as follows: if the number $N$ of assets taken
into account in the selection of a securities portfolio is sufficiently large, then for
‘the majority of assets’ we must ensure the ‘almost linear’ relation between the
coefficients $a_0(A_i), a_1(A_i), \ldots, a_q(A_i)$:

$$a_0(A_i) \approx \lambda_0 + \sum_{k=1}^{q}\lambda_k a_k(A_i), \tag{19}$$

where all the variables under consideration depend on $N$ and
$a_0(A_i) = \mathsf{E}\rho(A_i)$.


Moreover, there exists a portfolio $d = (d_1, \ldots, d_N)$ such that the variance of the
return $\rho(d)$ is sufficiently small (in view of (11)), which means that the influence of
the noise terms $\zeta(A_i)$ and individual active factors $f_j$ can be reduced by means of
diversification (provided that there is no asymptotic arbitrage). One should bear in
mind, however, that all the above holds only for large $N$, i.e., for ‘large’ markets,
while for small markets calculations of the return expectation $\mathsf{E}\rho(A_i)$ on the basis
of the expression on the right-hand side of (19) can lead to grave errors. (As regards
the corresponding precise assertion, see [231], [240], and [412]; for a rigorous
mathematical theory of asymptotic arbitrage based on the concept of contiguity,
see [250], [260], [261], [273], and Chapter VII, §§ 3a, b, c.)


§ 2e. Analysis, Interpretation, and Revision 
of the Classical Concepts of Efficient Market. I 


1. The central idea, the cornerstone of the concept of efficient market, is the
assumption that the prices instantaneously assimilate new data and are always set in
a way that gives one no opportunity to ‘buy cheap and sell immediately at a higher
price elsewhere’, i.e., as one usually says, there are no opportunities for arbitrage.

We have already shown that this idea of a ‘rationally’ organized, ‘fair’ market 
brings one to (normalized) market prices described by martingales (with respect to 
some measure equivalent to the initial probability measure). 

We recall that if $X = (X_n)_{n \ge 0}$ is a martingale with respect to the filtration
$(\mathcal{F}_n)_{n \ge 0}$, then $\mathsf{E}(X_{n+m} \mid \mathcal{F}_n) = X_n$. Hence the optimal (in the
mean-square sense) estimator $\widehat{X}_{n+m;n}$ for the variable $X_{n+m}$ based on the ‘data’
$\mathcal{F}_n$ is simply the variable $X_n$, because $\widehat{X}_{n+m;n}$ is the same as
$\mathsf{E}(X_{n+m} \mid \mathcal{F}_n)$.

Thus, we can say that our martingale conjecture for the prices (Xn) corresponds 
to the (economically conceivable) assumption that, on a ‘well-organized’ market, 
the best (in the mean-square sense, in any case) projection of the ‘tomorrow’ (‘the
day after tomorrow’, etc.) prices that can be made on the basis of the ‘today’ 
information is the current price level. 
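
The claim about the mean-square optimality of the current level can be illustrated on the simplest martingale, a symmetric random walk with steps $\pm 1$. The sketch below is only an illustration under that assumption: it enumerates all equally likely paths and compares the forecast $X_n$ with a hypothetical ‘trend’ rule that extrapolates the last observed step.

```python
from itertools import product

def forecast_errors(n, m):
    """Exact mean-square errors of two forecasts of X_{n+m} for a symmetric
    +/-1 random walk X (a martingale), given the path up to time n."""
    paths = list(product((-1, 1), repeat=n + m))  # all equally likely step sequences
    mse_current = 0.0   # forecast: X_n (the martingale forecast)
    mse_trend = 0.0     # forecast: X_n + m * (last observed step), a naive trend rule
    for steps in paths:
        x_n = sum(steps[:n])
        x_future = sum(steps)
        mse_current += (x_future - x_n) ** 2
        mse_trend += (x_future - (x_n + m * steps[n - 1])) ** 2
    return mse_current / len(paths), mse_trend / len(paths)

mse_current, mse_trend = forecast_errors(n=3, m=2)
print(mse_current, mse_trend)   # prints 2.0 6.0
```

For this walk the error of the forecast $X_n$ equals $m$, while the trend rule pays an additional $m^2$ for pretending that the last step carries information about the future.
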
In other words, the forecast is trivial. This seemingly rules out any chance of the
prediction of ‘the future dynamics of the observable prices’. (In his construction of 
a Brownian motion as a model for this evolution of prices L. Bachelier started essen- 
tially just from this idea of the impossibility of a better forecast of the ‘tomorrow’ 
prices than a mere repetition of their current levels.) 

At the same time, it is well known that market operators (including such ex- 
perts as ‘fundamentalists’, ‘technicians’, and quantitative analysts, ‘quants’) are 
still trying to forecast ‘the future price movement’, foresee the direction of changes 
and the future price levels, ‘work out’ a timetable for buying and selling the stock 
of particular corporations, etc. 


Remark. ‘Fundamentalists’ make their decisions by looking at the state of the
‘economy at large’, or some of its sectors; the prospects of development are of particular
interest for them; they build their analysis upon the assumption that the actions
of market operators are ‘rational’. Those tending to ‘technical’ analysis concentrate
on the ‘local’ peculiarities of the market; they emphasize ‘mass behavior’ as
a crucial factor.

As noted in [385; pp. 15-16], the ‘fundamentalists’ and the ‘technicians’ were the 
two major groups of financial market analysts in 1920-50. A third group emerged 
in the 1950s: ‘quants’, quantitative analysts, who are followers of L. Bachelier. This 
group is more strongly biased towards ‘fundamentalists’ than towards the proponents of
the ‘technical’ approach, who attach more weight to market moods than to the 
rational causes of investors’ behavior. 


2. We now return to the problem of building a basis under aspirations and at- 
tempts to forecast the ‘future’ price movement. Of course, we must start with the 
analysis of empirical data, look for explanations of several non-trivial features (for 
example, the cluster property) characteristic of the price movement, and clear up
the probabilistic structure of prices as stochastic processes. 


Let $H_n = \ln \dfrac{S_n}{S_0}$ be the logarithms of some (discounted) prices. We can represent
the $H_n$ as the sums $H_n = h_1 + \cdots + h_n$ with $h_k = \ln \dfrac{S_k}{S_{k-1}}$.

If the sequence $(H_n)$ is a martingale with respect to some filtration $(\mathcal{F}_n)$, then
the variables $(h_n)$ form a martingale-difference ($\mathsf{E}(h_n \mid \mathcal{F}_{n-1}) = 0$), and
therefore the $h_n$ (assuming that they are square-integrable) are uncorrelated, i.e.,
$\mathsf{E}h_{n+m}h_n = 0$, $m \ge 1$, $n \ge 1$.
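
The step from the martingale-difference property to uncorrelatedness can be spelled out as follows (a short derivation added for convenience; it uses only the tower property of conditional expectations and the square-integrability assumed above). For $m \ge 1$ the variable $h_n$ is $\mathcal{F}_{n+m-1}$-measurable, so that

$$\mathsf{E}h_{n+m}h_n = \mathsf{E}\bigl[h_n\,\mathsf{E}(h_{n+m} \mid \mathcal{F}_{n+m-1})\bigr] = 0,$$

and since $\mathsf{E}h_n = \mathsf{E}h_{n+m} = 0$, this quantity is precisely the covariance of $h_{n+m}$ and $h_n$.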

As is well known, however, this does not mean that these variables are independent.
It is not improbable that, say, $h_{n+m}^2$ and $h_n^2$ or $|h_{n+m}|$ and $|h_n|$ are positively
correlated. The empirical analysis of much financial data shows that this is indeed
the case (which does not contradict the assumption of the martingale nature of
prices on an ‘efficient’ market!). It is also remarkable that this positive correlation,
revealing itself in the behavior of the variables $(h_n)_{n \ge 1}$ as the phenomenon of their
clustering into groups of large or small values, can be ‘caught’, understood, by
means of several fairly simple models (e.g., ARCH, GARCH, models of stochastic
volatility, etc.), which we discuss below in Chapter II. This indicates an opportunity
(or at any rate, a chance) for a more non-trivial (and, in general, non-linear)
prediction, e.g., of the absolute values $|h_{n+m}|$. There also arises a possibility of making
more refined conclusions about the joint distributions of the sequence $(h_n)_{n \ge 1}$.

Taking the simplest assumption on the probabilistic nature of the $h_n$, let

$$h_n = \sigma_n\varepsilon_n,$$

where the $\varepsilon_n$ are independent standard normally distributed variables,
$\varepsilon_n \sim \mathcal{N}(0,1)$, and the $\sigma_n$ are some constants, the standard deviations of the $h_n$:
$\sigma_n = +\sqrt{\mathsf{D}h_n}$.

However, this classical model of a Gaussian random walk has long been considered
inconsistent with the actual data: the results of ‘normality’ tests show that the
empirical distribution densities of the $h_n$ are more elongated, more leptokurtic around
the mean value than is characteristic of a normal distribution. The same analysis
shows that the tails of the distributions of the $h_n$ are heavier than in the case
of a normal distribution. (See Chapter IV for greater detail.)

In the finances literature one usually calls the coefficients $\sigma_n$ in the relation
$h_n = \sigma_n\varepsilon_n$ the volatilities. Here it is crucial that the volatility is itself volatile:
$\sigma = (\sigma_n)$ is not only a function of time (a constant in the simplest case), but it is
a random variable. (For greater detail on volatility see § 3a in Chapter IV.)

Mathematically, this assumption seems very appealing because it extends considerably
the standard class of (linear) Gaussian models and brings into consideration
(non-linear) conditionally Gaussian models, in which

$$h_n = \sigma_n\varepsilon_n,$$

where, as before, the $(\varepsilon_n)$ are independent standard normally distributed variables
($\varepsilon_n \sim \mathcal{N}(0,1)$), but the $\sigma_n = \sigma_n(\omega)$ are $\mathcal{F}_{n-1}$-measurable non-negative random
variables. In other terms,

$$\operatorname{Law}(h_n \mid \mathcal{F}_{n-1}) = \mathcal{N}(0, \sigma_n^2),$$

which means that $\operatorname{Law}(h_n)$ is a suspension (a mixture) of normal distributions
averaged over some distribution of volatility (‘random variance’) $\sigma_n^2$.

It should be noted that, as is well known in mathematical statistics, ‘mixtures’ 
of distributions with fast decreasing tails can bring about distributions with heavy 
tails. Since such tails actually occur for empirical data (e.g., for many financial in- 
dexes), we may consider conditionally Gaussian schemes to be suitable probabilistic 
models. 
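
A rough quantitative illustration of this effect is the kurtosis $\mathsf{E}h^4/(\mathsf{E}h^2)^2$, which equals 3 for a normal law and exceeds 3 for any non-degenerate mixture of centered normal laws. The sketch below computes it exactly for a two-point distribution of the variance; the two ‘regimes’ and their weights are hypothetical. Kurtosis is used here only as a crude proxy: a finite mixture is more peaked and more spread out than a single normal law, but it does not by itself produce power-law tails.

```python
def mixture_kurtosis(variances, weights):
    """Kurtosis E h^4 / (E h^2)^2 of a zero-mean normal scale mixture.

    Given sigma^2 = v the variable is N(0, v), so E(h^2 | v) = v and
    E(h^4 | v) = 3 v^2; averaging over v gives 3 E v^2 / (E v)^2.
    """
    mean_v = sum(w * v for v, w in zip(variances, weights))
    mean_v2 = sum(w * v * v for v, w in zip(variances, weights))
    return 3.0 * mean_v2 / mean_v ** 2

constant_vol = mixture_kurtosis([1.0], [1.0])            # plain Gaussian case
two_regimes = mixture_kurtosis([0.25, 4.0], [0.8, 0.2])  # hypothetical calm/turbulent regimes
print(constant_vol, two_regimes)
```

Both examples have unit variance, yet the second one, with a ‘calm’ regime 80% of the time and a ‘turbulent’ one 20% of the time, has kurtosis more than three times that of the normal law.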

Of course, turning to models with stochastic volatility (and, in particular, of the
‘$h_n = \sigma_n\varepsilon_n$’ kind), it is crucial for their successful application to give an ‘adequate’
description of the evolution of the volatility $(\sigma_n)$.

Already the first model, ARCH (Autoregressive Conditional Heteroskedasticity),
proposed in the 1980s by R. F. Engle [140] and postulating that

$$\sigma_n^2 = \alpha_0 + \alpha_1 h_{n-1}^2 + \cdots + \alpha_q h_{n-q}^2,$$

enables one to ‘grasp’ the above-mentioned cluster phenomenon of the observable
data revealed by the statistical analysis of financial time series.

The essence of this phenomenon is that large (small) values of $h_n$ imply large
(respectively, small) subsequent values (of uncertain sign, in general).

Of course, this becomes clearer once one makes a reference to the assumption of
ARCH-theory about the dependence of $\sigma_n^2$ on the ‘past’ variables $h_{n-1}^2, \ldots, h_{n-q}^2$.
Thus, in conformity with observations, if we have a large value of $|h_n|$ in this
model, then we should expect a large value of $|h_{n+1}|$. We point out, however,
that the model makes no forecasts of the direction of the price movement, gives no
information on the sign of $h_{n+1}$. (One practical implication is the following advice
to operators on a stochastic market: considering a purchase of, say, an option, one
had better buy simultaneously a call and a put option; cf. §4e in Chapter VI.)
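
The cluster property is easy to observe in a simulation. The following is a minimal sketch of an ARCH(1) sequence, $\sigma_n^2 = \alpha_0 + \alpha_1 h_{n-1}^2$; the parameter values, the sample size, the seed, and the helper names are all our assumptions, chosen only for illustration. It compares the sample lag-1 autocorrelation of the $h_n$ with that of the $h_n^2$.

```python
import math
import random

def simulate_arch1(n, alpha0, alpha1, seed=0):
    """Simulate h_k = sigma_k * eps_k with sigma_k^2 = alpha0 + alpha1 * h_{k-1}^2."""
    rng = random.Random(seed)
    h_prev, out = 0.0, []
    for _ in range(n):
        sigma = math.sqrt(alpha0 + alpha1 * h_prev ** 2)
        h_prev = sigma * rng.gauss(0.0, 1.0)
        out.append(h_prev)
    return out

def lag1_autocorr(x):
    """Sample lag-1 autocorrelation of the sequence x."""
    mean = sum(x) / len(x)
    dev = [v - mean for v in x]
    num = sum(a * b for a, b in zip(dev, dev[1:]))
    den = sum(a * a for a in dev)
    return num / den

h = simulate_arch1(n=50_000, alpha0=0.5, alpha1=0.3)
corr_h = lag1_autocorr(h)                     # expected near 0: no linear predictability
corr_h2 = lag1_autocorr([v * v for v in h])   # expected positive: volatility clustering
print(corr_h, corr_h2)
```

In line with the text, the signs of the $h_n$ remain unpredictable (the first correlation is close to zero), while their magnitudes are not (the second is markedly positive).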

Little surprise that the ARCH model gave birth to many similar models, constructed
to ‘catch on’ to other empirical phenomena (besides the ‘clusters’). The
best known among these is the GARCH (Generalized ARCH) model developed by
T. Bollerslev [48] (1986), in which

$$\sigma_n^2 = \alpha_0 + \alpha_1 h_{n-1}^2 + \cdots + \alpha_q h_{n-q}^2 + \beta_1\sigma_{n-1}^2 + \cdots + \beta_p\sigma_{n-p}^2.$$

(One can judge the diversity of the generalizations of ARCH by the list of their
names: HARCH, EGARCH, AGARCH, NARCH, MARCH, etc.)

One of the ‘technical’ advantages of the GARCH as compared with the ARCH
models is as follows: whereas one has to take large values of $q$
to fit the ARCH model to real data, in GARCH models we may content ourselves
with small $q$ and $p$.
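
One way to see why small $p$ and $q$ can suffice: already for $p = q = 1$ the GARCH recursion, once unrolled, expresses $\sigma_n^2$ through all past values $h_{n-1}^2, h_{n-2}^2, \ldots$ with geometrically decaying weights $\alpha_1\beta_1^j$, i.e., it behaves like an ARCH model of large order. The sketch below checks this identity numerically; the sequence `h`, the parameters, and the initial value `sigma2_0` are hypothetical.

```python
def garch11_variances(h, alpha0, alpha1, beta1, sigma2_0):
    """Run sigma_{k+1}^2 = alpha0 + alpha1 * h[k]^2 + beta1 * sigma_k^2 along h."""
    sigma2 = [sigma2_0]
    for h_prev in h:
        sigma2.append(alpha0 + alpha1 * h_prev ** 2 + beta1 * sigma2[-1])
    return sigma2

def unrolled_variance(h, alpha0, alpha1, beta1, sigma2_0):
    """The last variance above, rewritten as an ARCH-type sum over all past h^2."""
    n = len(h)
    const = alpha0 * sum(beta1 ** j for j in range(n))
    arch_part = alpha1 * sum(beta1 ** j * h[n - 1 - j] ** 2 for j in range(n))
    return const + arch_part + beta1 ** n * sigma2_0

h = [0.5, -1.2, 0.3, 2.0, -0.7, 0.1]            # hypothetical observations
params = dict(alpha0=0.1, alpha1=0.2, beta1=0.7, sigma2_0=1.0)
recursive = garch11_variances(h, **params)[-1]
unrolled = unrolled_variance(h, **params)
print(recursive, unrolled)
```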

Having observed that the ARCH and the GARCH models explain the cluster 
phenomenon, we must also mention the existence of other empirical phenomena 
showing that the relation between prices and volatility is more subtle. Practitioners 
know very well that if the volatility is ‘small’, then prices tend to long-term rises 
or drops. On the other hand, if the volatility is ‘large’, then the rise or the decline 
of prices are visibly ‘slowing’; the dynamics of prices tends to change its direction. 

All in all, the inner (rather complex) structure of the financial market gives one
some hopes of a prediction of the price movement as such, or at least, of finding 
sufficiently reliable boundaries for this movement. (‘Optimists’ often preface such 
hopes with sententiae about market prices that ‘must remember their past, after 
all’, although this thesis is controversial and far from self-evident.) 

In what follows (see Chapters II and III) we describe several probabilistic and 
statistical models describing the evolution of financial time series in greater detail. 
Now, we dwell on criticism towards and the revision of the assumption that the 
traders operate on an ‘efficient’ market. 

3. As said in § 2a, it is built in the ‘efficient market’ concept that all participants
are ‘uniform’ as regards their goals or the assimilation of new data, and they make
decisions on a ‘rational’ basis. These postulates are objects of a certain criticism,
however. Critics maintain that even if all participants have access to the entire
information, they do not respond to it or interpret it uniformly, in a homogeneous
manner; their goals can be utterly different; the periods when they are financially
active can be various: from short ones for ‘speculators’ and ‘technicians’ to long
periods of the central banks’ involvement; the attitudes of the participants to various
levels of riskiness can also be considerably different.

It has long been known that people are ‘not linear’ in their decisions: they are 
less prone to take risks when they are anticipating profits and more so when they 
confront a prospect of losses [465]. The following dilemma formulated by A. Tversky
of Stanford University can serve as an illustration of this quality (see [403]):


a) “Would you rather have $85 000 or an 85% chance of $100 000?”
b) “Would you rather lose $85 000 or run an 85% risk of losing $100 000?”


Most people would rather take $85 000 and not try to get $100 000 in case a).
In the second case b) the majority prefers a bet giving a chance (of 15%) to avoid
losses.
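
It is worth noting that in each case the two alternatives have the same expected value, so that a criterion based on expectations alone cannot distinguish between them; the observed preferences reflect the attitude to risk only. The arithmetic can be sketched as follows (the representation of a bet as a list of probability and payoff pairs is our choice).

```python
def expected_value(outcomes):
    """Expected value of a lottery given as (probability, payoff) pairs."""
    return sum(p * x for p, x in outcomes)

# Case a): a sure gain versus a risky gain.
sure_gain = expected_value([(1.0, 85_000)])
risky_gain = expected_value([(0.85, 100_000), (0.15, 0)])

# Case b): a sure loss versus a risky loss.
sure_loss = expected_value([(1.0, -85_000)])
risky_loss = expected_value([(0.85, -100_000), (0.15, 0)])

print(sure_gain, risky_gain, sure_loss, risky_loss)
```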

One factor of prime importance in investment decisions is time. Given an op- 
portunity to obtain either $5000 today or $5150 in a month, you would probably 
prefer money today. However, if the first opportunity presents itself in a year
and the second in 13 months, then the majority would prefer to wait 13 months. That
is, investors can have different ‘time horizons’ depending on their specific aims, 
which, generally speaking, conforms badly with ‘rational investment’ models as- 
suming (whether explicitly or not) that all investors have the same ‘time horizon’. 
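
To see why such a reversal is at odds with a single fixed ‘time horizon’, suppose (this is our simplifying assumption, not a model from the text) that an investor discounts future payments by a constant factor per month. Then the comparison of $5000 at some date with $5150 one month later does not depend on how far away that date is, as the following sketch shows for two hypothetical discount factors.

```python
def prefers_later(delta, months_until_first, early=5000.0, late=5150.0):
    """True if an investor with a constant monthly discount factor `delta`
    prefers `late` (paid one month after `early`) to `early`."""
    pv_early = early * delta ** months_until_first
    pv_late = late * delta ** (months_until_first + 1)
    return pv_late > pv_early

impatient, patient = 0.95, 0.99   # hypothetical monthly discount factors
for delta in (impatient, patient):
    print(delta, prefers_later(delta, 0), prefers_later(delta, 12))
```

Each of the two investors makes the same choice whether the first payment is due today or in a year, so the preference pattern described above requires a discount rate that itself depends on the horizon.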

We noted in § 2a.3 another inherent feature of the concept of efficient market: the 
participants must correct their decisions instantaneously once they get acquainted 
with new data. Everybody knows, however, that this is never the case: people need 
certain time (different for different persons) to ponder over new information and 
take one or another decision. 


4. As G. Soros argues throughout his book [451], apart from ‘equilibrium’ or ‘close 
to equilibrium’ state, a market can also be ‘far from equilibrium’. Addressing the 
idea that the operators on a market do not adequately perceive and interpret infor- 
mation, G. Soros writes: “We may distinguish between near-equilibrium conditions 
where certain corrective mechanisms prevent perceptions and reality from drifting 
too far apart, and far-from-equilibrium conditions where a reflexive double-feedback 
mechanism is at work and there is no tendency for perceptions and reality to come
closer together without a significant change in the prevailing conditions, a change 
of regime. In the first case, classical economical theory applies and the divergence 
between perceptions and reality can be ignored as mere noise. In the second case, 
the theory of equilibrium becomes irrelevant and we are confronted with a one- 
directional historical process where changes in both perceptions and reality are 
irreversible. It is important to distinguish between these two different states of
affairs because what is normal in one is abnormal in the other.” [451, p. 6]. 

The views of R. B. Olsen [2], the founder of the Research Institute for Applied 
Economics (“Olsen & Associates”, Zürich), fall in unison with these words: “There 
is a broad spectrum of market agents with different time horizons. These horizons 
range from one minute for short-term traders to several years for central bankers 
and corporations. The reactions of market agents to an outside event depend on 
the framework of his/her time horizon. Because time horizons are so different and 
vary by a factor of one million, economic agents take different decisions. This leads 
to a ripple effect, where the heterogeneous reactions of the agents are new events, 
requiring in turn secondary reactions by market participants.” 

It is apparently too early so far to speak of a rigorous mathematical theory 
treating financial markets as ‘large complex systems’ existing in the environment 
close to actually observed rather than in the classical ‘equilibrium’ one. We can 
define the current stage as that of ‘data accumulating’, ‘model refining’. Here new 
methods of picking and storing information, its processing and assessing (which we 
discuss below, see Chapter IV) are of prime importance. All this provides necessary 
empirical data for the analysis of various conjectures pertaining to the operations of 
securities markets and for the correction of postulates underlying, say, the notion of 
an efficient market or assumptions about the distribution of prices, their dynamics, 
etc. 


§2f. Analysis, Interpretation, and Revision 
of the Classical Concepts of Efficient Market. II 


1. In this section we continue the discussion (or, rather, the description) of the 
assumptions underlying the concepts of efficient market and rational behavior of 
investors on it. We concentrate now on several new aspects, which we have not yet 
discussed. 

The concept of efficient market was a remarkable achievement in its time, which
has played (and plays still) a dominant role both in financial theory and the business 
of finance. It is therefore clear that pointing out its strong and weak points can 
help the understanding of neoclassical ideas (e.g., of a ‘fractal’ structure of the 
market) common nowadays in the economics and mathematics literature devoted 
to financial markets’ properties and activities. 


2. We have already seen that the concept of efficient market is based on the as- 
sumption that the ‘today’ prices are set with all the available information com- 
pletely taken into account and that prices change only when this information is 
updated, when ‘new’, ‘unexpected’, ‘unforeseen’ data become available. Moreover, 
investors on such a market believe the established prices to be ‘fair’ because all the 
participants act in a ‘uniform’, ‘collectively rational’ way. 

These assumptions naturally make the random walk conjecture (that the price
is a sum of independent terms) and its generalization, the ‘martingale conjecture’
(which implies that the best forecast of the ‘tomorrow’ price is its current level),
look quite plausible.

All this can be expressed by the phrase ‘the market is a martingale’, i.e., one 
plays fairly on an ‘efficient market’ (which is consistent with the traditional explana- 
tion of the word ‘martingale’; see Chapter II, §§ 1b,c and, for example, [439; Chap- 
ter VII, § 1], where more details are given). 

The reader must have already observed that, in fact, the concept of ‘efficient 
market’ (§ 2a) simply postulates that ‘an efficient market is a martingale’ (with 
respect to one or another ‘information flow’ and a certain probability measure). 
The corresponding arguments were not mathematically rigorous, but rather of the 
intuitive and descriptive nature. 

In fact, this assertion (‘the market is a martingale’) has an irreproachable math- 
ematical interpretation, provided that we start from the conjecture that (by defi- 
nition) a ‘fair’, ‘rationally’ organized market is an arbitrage-free market. In other 
words, this is a market where no riskless profits are possible. (See § 2a in Chapter V 
for the formal definition.) 

As we shall see below, one implication of this assumption of the absence of ar- 
bitrage is that there exists, generally speaking, an entire spectrum of (‘martingale’) 
measures such that the (discounted) prices are martingales with respect to these 
measures. This means more or less that the market can have an entire range of sta- 
ble states, which, in its turn, is definitely related to the fact that market operators 
have various aims and different amounts of time to process and assimilate newly 
available information. 

This presence of investors with different interests and potentials is a positive
factor rather than the deficiency it may seem at first glance.

The fact is that they reflect the ‘diversification’ of the market that ensures 
its liquidity, its capacity to transform assets promptly into means of payment (e.g., 
money), which is necessary for stability. We can support this thesis by the following 
well-known facts (see, e.g., [386; pp. 46-47]). 

On the day of President J. F. Kennedy’s assassination (22.11.1963) markets
immediately responded to the ensuing uncertainty: the long-term investors either
suspended operations or turned to short-term investment. The exchanges were then
closed for several days, and when they re-opened, the ‘long-term’ investors, guided
by ‘fundamental’ information, returned to the market.

Although the complete picture of the well-known financial crash of October 19,
1987 in the USA is probably not yet understood, it is known that just before
that date ‘long-term’ investors were selling assets and switching to ‘short-term’
operations. The reasons lay in the Federal Reserve System’s tightening of monetary
policy and the prospects of rises in property prices. As a result, the market was
dominated by short-term activity, and in this environment ‘technical’ information
(based, as it often is in unstable times, on hearsay and speculations) came to the
forefront.


In both examples, the long-term investors’ ‘run’ from the market brought about 
a lack of liquidity and, therefore, instability. All this points to the fact that, for 
stability, a market must include operators with different ‘investment horizons’, there 
must be ‘nonhomogeneity’, ‘fractionality’ (or, as they put it, ‘fractality’) of the 
interests of the participants. 

The fact that financial markets have the property of ‘statistical fractality’ (see 
the definition in Chapter III, §5b) was explicitly pointed out by B. Mandelbrot as 
long ago as the 1960s. Later on, this question has attracted considerable attention, 
reinforced by newly discovered phenomena, such as the discovery of the statistical 
fractal structure in the currency cross rates or (short-term) variations of stock and 
bond prices. 

As regards our understanding of what models of evolution of financial indicators 
are ‘correct’ and why ‘stable’ systems must have ‘fractal’ structure, it is worthwhile 
to compare deterministic and statistical fractal structures. In this connection, we 
provide an insight into several interconnected issues related to ‘non-linear dynamic 
systems’, ‘chaos’, and so on in Chapter II, § 4. 

We have already mentioned that we adhere in this book to an irreproachable
mathematical theory of the ‘absence of arbitrage’. We must point out in this
connection that none of such concepts as efficiency, absence of arbitrage, fractality
can be a substitute for another. They supplement one another: for instance,
many arbitrage-free models have fractal structure, while fractional processes can be
martingales (with respect to some martingale measures; and then the corresponding
market is arbitrage-free), but can also fail to be martingales, as does, e.g., a
fractional Brownian motion (for the Hurst parameter $H$ in $(0, \tfrac12) \cup (\tfrac12, 1)$).
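
A quick way to see that a fractional Brownian motion $B_H$ with $H \ne \tfrac12$ cannot be a martingale with respect to its own filtration and the original measure is to look at its increments: a square-integrable martingale has uncorrelated increments, whereas the standard covariance function $\tfrac12\bigl(|s|^{2H} + |t|^{2H} - |t-s|^{2H}\bigr)$ gives correlated ones. The sketch below evaluates this covariance for adjacent unit increments; the function names are ours, and the check illustrates only this weaker claim, not the failure of the semimartingale property.

```python
def fbm_cov(s, t, hurst):
    """Covariance E B_H(s) B_H(t) of a standard fractional Brownian motion."""
    return 0.5 * (abs(s) ** (2 * hurst) + abs(t) ** (2 * hurst)
                  - abs(t - s) ** (2 * hurst))

def adjacent_increment_cov(hurst):
    """Covariance of B_H(1) - B_H(0) and B_H(2) - B_H(1); equals 2^(2H-1) - 1."""
    return fbm_cov(1, 2, hurst) - fbm_cov(1, 1, hurst)

for hurst in (0.3, 0.5, 0.7):
    print(hurst, adjacent_increment_cov(hurst))
```

The covariance vanishes for $H = \tfrac12$ (ordinary Brownian motion), is negative for $H < \tfrac12$, and positive for $H > \tfrac12$.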


3. In a similar way to §2a, where we gave a descriptive definition of an ‘efficient’ 
market, it seems appropriate here, for a conclusion, to sum up the characteristic 
features of a market with fractal structure (a fractal market in the vocabulary 
of [386]) as follows: 


1) at each instant of time prices on such a market are corrected by investors on 
the basis of the information relevant to their ‘investment horizons’; investors 
do not necessarily respond to new information momentarily, but can react
after it has been reaffirmed; 

2) ‘technical information’ and ‘technical analysis’ are decisive for ‘short’ time 
horizons, while ‘fundamental information’ comes to the forefront with their 
extension; 

3) the prices established are results of interaction between ‘short-term’ and 
‘long-term’ investors; 

4) the ‘high-frequency’ component in the prices is caused by the activities 
of ‘short-term’ investors, while the ‘low-frequency’, ‘smooth’ components 
reflect the activities of ‘long-term’ ones;

5) the market loses ‘liquidity’ and ‘stability’ if it sheds investors with various
‘investment horizons’ and loses its fractal character.

4. As follows from the above, we mainly consider in this book models of (‘efficient’)
markets with no opportunities for arbitrage.

Examples of such models that we thoroughly discuss in what follows are the
Bachelier model (Chapter III, §4b and Chapter VIII, §1a), the Black–Merton–Scholes
model (Chapter III, §4b and Chapter VII, §4c), and the Cox–Ross–Rubinstein
model (Chapter II, §1e and Chapter V, §1d) based on a linear Brownian
motion, a geometric Brownian motion, and a geometric random walk, respectively.

It seems fairly plausible after our qualitative description of ‘fractal’ markets that 
there may exist markets of this kind with opportunities for arbitrage. 

The simplest examples of such models are (as recently shown by L. R. C. Rogers
in “Arbitrage with fractional Brownian motion”, Mathematical Finance, 7 (1997),
95-105) modified models of Bachelier and Black–Merton–Scholes, in which a Brownian
motion is replaced by a fractional Brownian motion with $H \in (0, \tfrac12) \cup (\tfrac12, 1)$.
See also Chapter VII, §2c below.


Remark. A fractional Brownian motion with $H \in (0, \tfrac12) \cup (\tfrac12, 1)$ is not a
semimartingale, therefore the corresponding martingale measures are nonexistent; see
Chapter III, §2c for details. This is an indirect indication (cf. the First fundamental
theorem in Chapter V, §2d) that there may (not necessarily!) exist arbitrage in
such models.


3. Aims and Problems of Financial Theory, 
Engineering, and Actuarial Calculations 


§ 3a. Role of Financial Theory and Financial Engineering. 
Financial Risks 


1. Our discussion of financial structures and markets operating under uncertainty 
in the previous sections clearly points out RISKS as one of the central concepts 
both in Financial theory and Insurance theory. 

This is a broad notion whose content is common knowledge.

For instance, the credit-lending risk is related to the losses that the lender can 
incur if the borrower defaults. 

Operational risk is related to errors possible, e.g., in payments. 

Investment risk is a consequence of an unsatisfactory study of the details of an
investment project, miscalculated investment decisions, possible changes of the
economic or the political situation, and so on.

Financial mathematics and engineering (in its part relating to operations with
securities) tackles mostly market risks brought about by the uncertainty of the
development of market prices, interest rates, the unforeseeable nature of the actions
and decisions of market operators, and so on.

The steep growth in attention paid by Financial theory to mathematical and 
engineering aspects (particularly noticeable during the last 2-3 decades) must have
its explanation. The answer appears to be a simple one; it lies in the radical changes 
visible in the financial markets: the changes of their structure, greater volatility of 
prices, emergence of rather sophisticated financial instruments, new technologies 
used in price analysis, and so on. 

All this imposes heavy weight on financial theory and puts forward new prob- 
lems, whose solution requires a rather ‘high-brow’ mathematics. 


2. At first glance at financial mathematics and engineering, when one
emphasizes the ‘game’ aspect (‘investor’ versus ‘market’), one can get an impression
that it is the main aim of financial mathematics to work out recommendations and
develop financial tools enabling the investor to ‘gain over markets’; at any rate,
‘not to lose very much’.

However, the role of financial theory (including financial mathematics) and fi- 
nancial engineering is much more prominent: they must help investors with the 
solution of a wide range of problems relating to rational investment by taking 
account of the risks unavoidable given the random character of the ‘economic en- 
vironment’ and the resulting uncertainties of prices, trading volumes, or activities 
of market operators. 

Financial mathematics and engineering are useful and important also in that 
their recommendations and the proposed financial instruments play a role of a 
‘regulator’ in the reallocation of funds, which is necessary for better functioning of 
particular sectors and the whole of the economy. 

The analysis of the securities market as a ‘large’ and ‘complex’ system calls for 
complicated, advanced mathematics, methods of data processing, numerical meth- 
ods, and computing resources. Little wonder, therefore, that the finances literature 
draws on the most up-to-date results of stochastic calculus (on Brownian motion, 
stochastic differential equations, local martingales, predictability, ...), mathemati- 
cal statistics (bootstrap, jackknife, ...), non-linear dynamics (deterministic chaos, 
bifurcations, fractals, ...), and, of course, it would be difficult to imagine modern 
financial markets without advanced computers and telecommunications. 


3. Markowitz’s theory was the first significant trial of the strength of probabilis- 
tic methods in the minimization of unavoidable commercial and financial risks by 
building an investment portfolio on a rational basis. 

As regards ‘risks’, the fundamental economic postulates are as follows: 


1) ‘big’ profits mean big risks 
and, on the way to these profits, 
2) the risks must be ‘justifiable’, ‘reasonable’, and ‘calculated’. 


We can say in connection with 1) that ‘big profits are a compensation for risks’; 
it seems appropriate to recall the proverb: “Nothing ventured, nothing gained”. 
In connection with 2), we recall that ‘to calculate’ the risks in the framework of 
Markowitz’s theory amounts to making up an ‘efficient’ portfolio of securities
ensuring the maximum average profit that is possible under certain constraints on
the size of the ‘risk’ measured in terms of the variance. We point out again that 
the idea of diversification in building an ‘efficient’ portfolio has gained solid ground 
in financial mathematics and became a starting point for the development of new 
strategies (e.g., hedging) and various financial instruments (options, futures, ...). 
It can be expressed by the well-known advice: “Don’t put all your eggs in one 
basket.” 
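
The effect of diversification on the variance can be made explicit in the simplest setting. Assume (purely for illustration) $N$ securities with the same return variance $\sigma^2$ and the same pairwise correlation $\rho$, held in equal proportions; then the variance of the portfolio return is $\sigma^2/N + (1 - 1/N)\rho\sigma^2$. A minimal sketch with hypothetical numbers:

```python
def equal_weight_variance(n_assets, sigma2, rho):
    """Variance of the return on an equally weighted portfolio of n_assets
    securities, each with return variance sigma2 and pairwise correlation rho."""
    w = 1.0 / n_assets
    own = n_assets * w * w * sigma2
    cross = n_assets * (n_assets - 1) * w * w * rho * sigma2
    return own + cross

for n in (1, 10, 100):
    print(n, equal_weight_variance(n, sigma2=0.04, rho=0.0),
          equal_weight_variance(n, sigma2=0.04, rho=0.3))
```

For uncorrelated securities the variance decreases like $1/N$, while for positively correlated ones it cannot fall below $\rho\sigma^2$: diversification removes the individual part of the risk, but not the common one.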

Mean-variance analysis is based on a ‘quadratic performance criterion’ (it measures
the risks in terms of the variance of return on a portfolio) and assumes that the
market in question is ‘static’. Modern analysts consider also more general quality
functions and utility functions in their study of optimal investment, consumption,
and allocation of resources. It is important to point out here a new aspect: the
time dynamics, the need to take decisions ‘successively’, ‘in stages’. Incidentally,
decisions must be taken on the basis of generally accessible information and without
anticipation. ‘Statics’ means that we are interested in the profits at time $N > 0$,
while the portfolio is made up at time $n = 0$ (i.e., it is $\mathcal{F}_0$-measurable).
Allowing for ‘dynamics’, we are tracing development in time, so that the returns at a
particular instant $n \le N$ are defined by the portfolio built on the basis of
information obtained by time $n - 1$ (inclusive; i.e., the components of the portfolio are
$\mathcal{F}_{n-1}$-measurable), and so on.
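
The requirement of ‘no anticipation’ can be sketched in code as follows: the position held over the period from $n-1$ to $n$ is computed from the prices up to time $n-1$ only. The price path, the two rules, and all the names below are hypothetical and serve only to contrast a ‘static’ choice with a ‘dynamic’ one.

```python
def trading_gains(prices, strategy):
    """Cumulative gain sum_n gamma_n * (S_n - S_{n-1}), where the position gamma_n
    is chosen by `strategy` from the prices S_0, ..., S_{n-1} only."""
    gain = 0.0
    for n in range(1, len(prices)):
        gamma_n = strategy(prices[:n])      # sees the past up to time n - 1, never S_n
        gain += gamma_n * (prices[n] - prices[n - 1])
    return gain

def buy_and_hold(past):
    """'Static' rule: one unit, fixed at time 0 regardless of later data."""
    return 1.0

def momentum(past):
    """'Dynamic' rule (hypothetical): hold one unit only after an up-move."""
    return 1.0 if len(past) >= 2 and past[-1] > past[-2] else 0.0

prices = [100.0, 101.0, 103.0, 102.0, 104.0]   # hypothetical price path
print(trading_gains(prices, buy_and_hold))     # prints 4.0
print(trading_gains(prices, momentum))         # prints 1.0
```

Passing only `prices[:n]` to the rule is what enforces predictability here; a rule that could look at `prices[n]` would be anticipating.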

Clearly, the incorporation of the dynamical aspect in the corresponding prob- 
lems of financial mathematics calls for concepts and methods of optimal stochastic 
control, stochastic optimization, dynamic programming, statistical sequential
analysis, and so on.

In our discussion of the risks relating to ‘uncertainties’ of all kinds we cannot 
bypass the issues of ‘risk’ insurance, which are the subject of Insurance theory or, 
somewhat broader, the Theory of actuarial calculations. 

In the next section we describe briefly the formation of the insurance business 
providing a mechanism of compensation for financial losses, the history and the 
role of the ‘actuarial trade’. We shall not use this information in what follows; 
nevertheless, we think it appropriate here, since it can give one an idea of a realm 
that is closely related to finance. Moreover, it becomes increasingly clear that 
financial and actuarial mathematics share their ideas and their methods.


§3b. Insurance: a Social Mechanism of Compensation 
for Financial Losses 


1. An actuarius in ancient Rome was a person making records in the Senate’s Acta 
Publica or an army officer processing bills and supervising military supplies. 

In its English version, the meaning of the word ‘actuary’ has undergone several 
changes. First, it was a registrar, then a secretary or a counselor at a joint stock 
company. In due course, the trade of an actuary became associated with one 
who performed mathematical calculations relating to life expectancies, which are 
fundamental for pricing life insurance contracts, annuities, etc. 

In modern usage an actuary is an expert in the mathematics of insurance. Actuaries
are often called social mathematicians, for they hold key positions in working out
the strategy not only of insurance companies, but also of pension and other kinds
of funds; government actuaries are in charge of various issues of state insurance,
pensions, and other entitlement schemes.

Insurance (assurance) is a social mechanism of compensation of individuals and
organizations for financial losses incurred through unfavorable circumstances of
one kind or another.

The purpose of insurance is to put certainty in place of the uncertainties of
financial evaluations that are due to possible losses in the future.

Insurance can be defined as a social instrument that enables individuals or
organizations to pay in advance to reduce (or rule out) a certain share of the risk of
losses.

Mankind was fairly quick at comprehending that the most efficient way to lessen 
the losses from uncertainties is a cooperation mechanism distributing the cost of the 
losses borne by an individual over all participants. Individuals were smart enough
to realize that it is difficult to forecast the timing, location, and the scope of events 
that can have implications unfavorable for their economic well-being. Insurance 
provided an instrument that could help a person to lessen, ‘tame’ the impact of the 
uncertain, the unforeseen, and the unknown. 

It would be wrong to reduce insurance (and the relevant mathematics) to, say, 
property and life insurance. It must be treated on a broader scale, as risk insurance, 
which means the inclusion of, e.g., betting in securities markets or investments 
(foreign as well as domestic). We shall see below that the current state of insurance 
theory (in the standard sense of the term ‘insurance’) is distinguished by extremely 
close interrelations with financial theory. One spectacular example here is provided by the
futures reinsurance contracts floated at the Chicago Board of Trade in December 1992.

Not all kinds of ‘uncertainty’, not all risks are subject to insurance. One usually 
uses here the following vocabulary and classification, which is helpful in outlining 
the realm of insurance. 


2. All ‘uncertainties’ are usually classified into one of two groups: pure uncertainties
and speculative uncertainties.

A speculative uncertainty is the uncertainty of possible financial gains or losses
(generally speaking, one does not sell insurance against such uncertainties).

A pure uncertainty means that there can be only losses (for example, from fire);
many of these can be insured against.

Often, one uses the same word ‘risk’ in the discussion of uncertainties of both 
types. (Note that, in insurance theory, one often identifies risk and pure uncertainty; 
on the other hand, a popular finance journal ‘Risk’ is mostly devoted to speculative
uncertainties.) 

The sources of risks and of the losses incurred by them are accidents, which are
also classified in insurance into one of two groups: physical accidents and moral
accidents.

Moral accidents are about dishonesty, unfairness, negligence, vile intentions, etc.

Physical accidents include, for instance, earthquakes, economic cycles,
weather, various natural phenomena, and so on.

There are different ways to ‘combat’ risks: 


1) One can deliberately avert risks, avoid them by rational behavior, decisions,
and actions.

2) One can reduce risks by transferring possible losses to other persons or
institutions.

3) One can try to reduce risks by making forecasts. Statistical methods are 
an important weapon of an actuary in forecasting possible losses. Forecast- 
making and accumulation of funds are decisive for a sustained successful 
insurance business. 

Although insurance is a logical and in many respects remarkable ‘remedy’ against 
risks, not all uncertainties and accompanying financial losses can be covered by it. 
Insurance risks must satisfy certain conditions; namely, 

1) there must exist a sufficiently large group of ‘uniformly’ insurable participants
with ‘characteristics’ stable in time;

2) insurance cases should not affect many participants at one time;

3) these cases and the range of losses should not be consequences of deliberate 
actions of those insured: they must be accidental; damages are paid only if 
the cause of losses can be determined; 

4) the potential losses ensuing from the risks in question must be easily iden- 
tifiable (‘difficult to counterfeit’); 

5) the potential losses must be sufficiently large (‘there is no point in the 
insurance against small, easily recoupable losses’); 

6) the probability of losses must be sufficiently small (insurance cases must be
rare), so as to make the insurance economically affordable;

7) statistical data must be accessible and must be able to serve as a basis for the
calculations of the probabilities of losses (‘representativeness and the feasibility of
statistical estimation’).

These and similar requirements are standard; they make up a minimal set of
conditions enabling insurance.

Several types (kinds) of insurance are known. Their diversity is easier to sort 
out if one considers 
(a) classes of insured objects 
(b) unfavorable factors that can lead to insurance risks 
(c) modes of payment (premiums, benefits) 

(d) types of insurance (social versus commercial). 

Group (a) includes various classes of objects, for instance, life, health, property, 
cars, ships, stocks of merchandise, and so on. 

Group (b) includes the above-mentioned physical and moral accidents. 

The modes of payment (of premiums and benefits) in group (c) vary with insur- 
ance policies; life insurance is subject to most diversification. 

Different types of insurance (d), either free (commercial) or obligatory (social),
are based on similar principles but differ in their philosophy and ways of organization.


3. The content of insurance, its aims and objectives are difficult to understand
without a knowledge of both its structure and history.

We can say that, in a sense, insurance is as old as mankind itself.

The most ancient forms of insurance known are the so-called bottomry and
respondencia contracts (for sea carriages) discovered in Babylonian texts of the
4th-3rd millennia BC.

Bottomry, which was essentially a pledge contract, was formally a loan (e.g., in 
the form of goods that must be delivered to some other place) made to the ship 
owner who could give not only the ship or another ‘tangible’ property as collateral, 
but also his own life and the lives of his dependents (which means that they could 
be reduced to slavery). In the case of a respondencia the collateral was always 
secured by goods. 

The Babylonians developed also a system of insurance contracts in which the 
supplier of goods, in case of a risky carriage, agreed to write off the loan if the 
carrier were robbed, kidnapped for ransom, and so on. 

The Hammurabi code (2100 BC) legalized this practice. It also provided for
damages and compensations (by the government) to individuals who had suffered 
from fire, robbery, violence, and so on. 

Later on, the practices of making such contracts were adopted (through the 
Phoenicians) by the Greeks, the Romans, and the Indians. It was mentioned in 
early Roman codes and Byzantine laws; it is reflected by the modern insurance 
legislation. 

The origins of life insurance can be traced back to Greek thiasoi and eranoi or to
Roman collegia, from 600 BC to the end of the Roman empire (which is traditionally
put at 476 AD). Originally, the collegia (guilds) were religious associations, but with
time, they took upon themselves more utilitarian functions, e.g., funeral services. 
Through their funeral collegia the Romans paid for funerals; they also developed 
some rudimentary forms of life insurance. The lawyer Ulpian (around 220 AD) 
made up mortality tables (rather crude ones). 

The practice of premium insurance has apparently started in Italian city re- 
publics (Venice, Pisa, Florence, Genoa), around 1250. The first precisely dated 
insurance contract was made in Genoa, in 1347. The first ‘proper’ life insurance 
contract, which related to pregnant women and slaves, was also signed there (1430). 

Annuity contracts were known to the Romans as long ago as the 1st century AD.
(These contracts stipulate that the insurance company is paid some money and it
pays back periodic benefits during a certain period of time or till the end of life.
This is in a certain sense a converse of life insurance contracts, where the insured
pays a premium on a regular basis and obtains a certain sum as a lump payment.)

The text of Rome’s Falcidian law included also a table of the mean values of life
expectancy necessary in the calculations of annuity payments and the like. In 225, 
Ulpian made up more precise tables that were in use in Tuscany through the 18th 
century. 

Three hundred years ago, in 1693, Edmund Halley (whose name is associated
with a well-known comet) improved Ulpian’s actuarial tables by assuming that the
mortality rate in a fixed group of people is a deterministic function of time. In 1756,
Joseph Dodson extended and corrected Halley’s tables, which made it possible to
draw up a year-to-year ‘scale of premiums’.

With the growth of cities and trade in medieval Europe, the guilds extended
their practice of helping their members in the case of a fire, a shipwreck, an attack
of pirates, etc.; they provided them aid in funerals, in the case of disability, and so on.
Following the example of 14th-century Genoa, sea insurance contracts spread over
virtually all European maritime nations.

The modern history of sea insurance is primarily ‘the history of Lloyd’s’, a 
corporation of insurers and insurance brokers founded in 1689, on the basis of 
Edward Lloyd’s coffee shop, where ship owners, salesmen, and marine insurers used 
to gather and make deals. It was incorporated by an act of British Parliament 
in 1871. Established for the aims of marine insurance, Lloyd’s presently covers
almost all kinds of risks.

Since 1974, Lloyd’s has published a daily Gazette providing the details of sea travel
(and, nowadays, also of air flights) and information about accidents, natural disas- 
ters, shipwrecks, etc. 

Lloyd’s also publishes weekly accounts of the ships loaded in British and con- 
tinental ports and the dates of the end of shipment. General information on the 
insurance market can also be found there.

In 1760, Lloyd’s gave birth to a society for the inspection and the classification 
of all sea-going ships of at least 100 tons’ capacity. Lloyd’s surveyors examine and 
classify vessels according to the state of their hulls, engines, safety facilities, and so 
on. This society also provides technical advice. 

The yearly “Lloyd’s Register of British and Foreign Shipping” provides the data 
necessary for Lloyd’s underwriters to negotiate marine insurance contracts even if 
the ship in question is thousands of miles away.

The great London fire of 1666 gave an impetus to the development of fire insur- 
ance. (The first fire insurance company was founded in 1667.) Insurance against 
breakdowns of steam generators was launched in England in 1854; it has been possible
to insure employers’ liability since 1880, transport liability since 1895, and collisions
of vehicles since 1899.

The first fire insurance companies in the United States emerged in New York 
in 1787 and in Philadelphia in 1794. Virtually from their first days, these companies 
began tackling also the issues of fire prevention and extinguishing. The first life 
insurance company in the United States was founded in 1759. 

The big New York fire of 1835 brought to the foreground the necessity to have 
reserves to pay unexpected huge (‘catastrophic’) damages. The great Chicago fire 
of 1871 showed that fire-insurance payments in modern, ‘densely built’ cities can be 
immense. The first examples of reinsurance (when the losses are covered by several 
firms) related just to fire damages of catastrophic size; nowadays, this is standard
practice in various kinds of insurance. As regards other first American examples 
of modern-type insurance, we can point out the following ones: casualty insurance 
(1864), liability insurance (1880), insurance against burglary (1885), and so on.

The first Russian joint-stock company devoted to fire insurance was set up in
Siberia, in 1827, and the first Russian firm insuring life and incomes (The Russian 
Society for Capital and Income Insurance) was founded in 1835. 

In the 20th century, we are eyewitnesses to the extension of the sphere of insur- 
ance related to ‘inland marine’ and covering a great variety of transported items 
including tourist baggage, express mail, parcels, means of transportation, transit
goods, and even bridges, tunnels, and the like. 

These days one can get insured against any conceivable insurable risk. Such 
firms as Lloyd’s insure dancers’ legs and pianists’ fingers, outdoor parties against 
the consequences of bad weather, and so on. 

Since the end of the 19th century one has seen a growing tendency of government
involvement in insurance, in particular, in domains related to the protection of
employees against illness and disability (temporary or permanent), old age insurance, and
unemployment insurance. Germany was apparently a pioneer in so-called social
security (laws of 1883-89).

In the middle of this century, a tendency to mergers and consolidations in the
insurance business became evident. For instance, there was a consolidation of American
life and property insurance companies in 1955-65. A new form of ‘merger’ became
widespread: a holding company that owns shares of other firms, not only insurance
companies, but also businesses in banking, computer services, and so on. The
strong point of these firms is their diversification, the variety of their potentialities.
The burden of taxes imposed on an insurance firm is lighter if it is a part of a 
holding company. A holding company can get involved with foreign stock, which is 
sometimes impossible for insurance firms. This gives insurance companies greater 
leverage: they have access to larger resources while investing less themselves. 


4. It is now impossible to consider the practice and theory of insurance separately
from the practice and theory of finance and investment in securities.

It is as good as established that derivative instruments (futures, options, swaps, 
warrants, straddles, spreads, and so on) will be in the focus of the future global 
financial system. Certainly, the pricing methods used in finance will penetrate 
the actuarial science ever more deeply. In this connection, it is worthwhile and 
reasonable not to separate actuarial and financial problems related, one way or 
another, to various forms of risks, but to tackle them in one package. This opinion
is also supported by the following division of the history of insurance mathematics
into periods, proposed by H. Bühlmann, the well-known Swiss expert in actuarial
science.

The first period (‘insurance of the first kind’) dates back to E. Halley who, as
already mentioned, drew up insurance tables (1693) based on the assumption that 
the mortality rate in a fixed group of people is a deterministic function of time. 

The second period (‘insurance of the second kind’) is connected with the intro- 
duction of probabilistic ideas and the methods of probability theory and statistics 
into life insurance and other types of insurance.

The third period (‘insurance of the third kind’) can be characterized by the
use on a large scale of financial instruments and financial engineering to reduce 
insurance risks. 

Mathematics of the insurance of the second kind is based on the Law of large 
numbers, the Central limit theorem, and Poisson-type processes. The theory of 
insurance of the third kind is more sophisticated: it requires the knowledge of
stochastic calculus, stochastic differential equations, martingales and related con- 
cepts, as well as new methods, such as bootstrap, resampling, simulation, neural 
networking, and so on. 

A good illustration of the reasonableness and advantages
of an integrated approach to actuarial and financial problems of securities markets is
the efficiency of a purely financial, ‘optional’ method in actuarial calculations related
to the reinsurance of ‘catastrophic events’ [87].

In the global market, one says that an accident is ‘catastrophic’ if ‘the damages 
exceed $5m and a large number of insurers and insured are affected’ (an extract 
from “Property Claims Services”, 1993). The actual size of the damages in some 
‘catastrophic cases’ is such that no single insurer is able (or willing) to insure against 
such accidents. This explains the fact that insurance against such events becomes 
not merely a joint but an international undertaking. 

Here are several examples of damages incurred by ‘catastrophic’ accidents. 

From 1970 through 1993 there occurred on average 34 catastrophes each year,
with annual losses amounting to $2.5 billion. In most cases, a ‘catastrophic’ event
brought damages of less than $250m. However, the losses from Hurricane Andrew
(August 1992) are estimated at $13.7 billion, of which only about $3 billion
of damages were reimbursed by insurance.

In view of the large sums payable in ‘catastrophic’ cases (hurricanes, earthquakes,
floods), the CBOT (Chicago Board of Trade) started in December 1992 futures
trading in catastrophe insurance contracts as an alternative to catastrophe reinsurance.
These contracts are easy to sell (‘liquid’) and anonymous; their operational costs
are low, and the supervision of all transactions by a clearing house gives one the
confidence important in deals of this kind. Moreover, the pricing of such futures
turned out in effect to reduce to the calculation of the rational price of arithmetic
Asian call options (see the definition in §1c). See [87] for greater detail.


§3c. A Classical Example of Actuarial Calculations: 
the Lundberg-Cramér Theorem 


1. Oddly, around the same time when L. Bachelier introduced a Brownian motion
to describe share prices and laid in this way the basis of stochastic financial
mathematics, Ph. Lundberg in Uppsala (Sweden) published his thesis “Approximerad
framställning av sannolikhetsfunktionen. Återförsäkring av kollektivrisker”
(1903), which became the cornerstone of the theory of insurance (of the second kind)
and where he developed systematically Poisson processes, which are, together with
Brownian motions, central objects of the general theory of stochastic processes.

In 1929, on the initiative of several Swedish insurers, Stockholm University
established a chair in actuarial mathematics. Its first holder was H. Cramér and this
marked the starting point in the activities of the ‘Stockholm group’, renowned for 
their results both in actuarial mathematics and general probability theory, statis- 
tics, and the theory of stochastic processes. 

We now formulate the classical result of the theory of actuarial calculations, the
Lundberg-Cramér fundamental theorem of risk theory.

FIGURE 10. Risk process $R_t$


We define the risk process (see Fig. 10) of, say, an insurance business as follows:

$$R_t = u + ct - \sum_{k=1}^{N_t} \xi_k,$$

where
$u$ is the initial capital,
$c$ is the rate of the collection of premiums,
$(\xi_k)$ is a sequence of independent, identically distributed random variables
with distribution $F(x) = P\{\xi_1 \le x\}$, $F(0) = 0$, and expectation
$\mu = E\xi_1 < \infty$,
$N = (N_t)_{t \ge 0}$ is a Poisson process:

$$N_t = \sum_{k \ge 1} I(T_k \le t),$$

where insurance claims are submitted at the instants $T_1, T_2, \ldots$ and it
is assumed that $(T_{k+1} - T_k)_{k \ge 1}$ are independent variables having
exponential distribution with parameter $\lambda$:


$$P\{T_{k+1} - T_k > t\} = e^{-\lambda t}.$$

Clearly,

$$E R_t = u + (c - \lambda\mu)t = u + \rho\lambda\mu t,$$

where the relative safety loading $\rho = c/(\lambda\mu) - 1$ is assumed to be positive (the
condition of positive net profit).

One of the first natural problems arising in connection with this model is the
calculation of the probability $P(\tau < \infty)$ of a ruin in general or the probability
$P(\tau \le t)$ of a ruin before time $t$, where

$$\tau = \inf\{t : R_t \le 0\}.$$

THEOREM (Lundberg-Cramér). Assume that there exists a constant $R > 0$ such
that

$$\int_0^\infty e^{Rx} (1 - F(x))\,dx = \frac{c}{\lambda}.$$

Then

$$P(\tau < \infty) \le e^{-Ru},$$

where $u$ is the initial capital.
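
For exponentially distributed claims the condition of the theorem can be solved explicitly, which gives a convenient check. The following sketch assumes that 1 - F(x) = exp(-x/mu); this choice of F, the function names, and the numerical values are illustrative assumptions, not part of the text. It computes R in closed form and by bisection, and then evaluates the bound exp(-R u).

```python
import math


def lundberg_coefficient_closed_form(lam: float, mu: float, c: float) -> float:
    """R for exponential claims with mean mu.

    For 1 - F(x) = exp(-x/mu) the integral of exp(R*x) * (1 - F(x)) over
    (0, inf) equals mu / (1 - R*mu) when R < 1/mu; equating it to c/lam
    gives R = (1 - lam*mu/c) / mu.
    """
    if c <= lam * mu:
        raise ValueError("the safety loading must be positive: c > lam * mu")
    return (1.0 - lam * mu / c) / mu


def lundberg_coefficient_bisection(lam: float, mu: float, c: float,
                                   iterations: int = 200) -> float:
    """Solve mu / (1 - R*mu) = c/lam for R in (0, 1/mu) by bisection."""
    if c <= lam * mu:
        raise ValueError("the safety loading must be positive: c > lam * mu")

    def g(r: float) -> float:
        # Increasing in r on (0, 1/mu); negative at r = 0 because c > lam*mu.
        return mu / (1.0 - r * mu) - c / lam

    lo, hi = 0.0, (1.0 - 1e-9) / mu
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if g(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def ruin_probability_bound(u: float, r: float) -> float:
    """Upper bound exp(-R*u) for the probability of ruin."""
    return math.exp(-r * u)


# Hypothetical parameters: one claim per unit time on average, mean claim 1,
# premium rate 1.25 (relative safety loading rho = 0.25).
lam, mu, c = 1.0, 1.0, 1.25
R = lundberg_coefficient_closed_form(lam, mu, c)
print(R, lundberg_coefficient_bisection(lam, mu, c))
print(ruin_probability_bound(10.0, R))
```

In terms of the relative safety loading, the closed form reads R = rho / (mu (1 + rho)), so a larger loading gives a faster decay of the bound in the initial capital u.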


The assumptions in the Lundberg-Cramér model can be weakened, while the
model itself can be made more complicated. For instance, we can assume that the
risk process has the following form:

$$R_t = u + (ct + \sigma B_t) - \sum_{k=1}^{N_t} \xi_k,$$

where $(B_t)$ is a Brownian motion and $(N_t)$ is a Cox process (that is, a ‘counting
process’ with random intensity; see, e.g., [250]).

In conclusion, we briefly dwell on the question of the nature of the distributions
$F = F(x)$ of insurance benefits. As a matter of pure convention, one often
classifies insurance cases leading to payments into one of the following three types:

• ‘normal’,
• ‘extremal’,
• ‘catastrophic’.

To describe ‘normal’ events, one uses distributions with rapidly decreasing tails
(e.g., an exponential distribution satisfying the condition $1 - F(x) \sim e^{-\lambda x}$ as
$x \to \infty$).

One describes ‘extremal’ events by distributions $F = F(x)$ with heavy tails; e.g.,
$1 - F(x) \sim x^{-\alpha}$, $\alpha > 0$, as $x \to \infty$ (Pareto-type distributions) or

$$1 - F(x) = \exp\Bigl\{-\Bigl(\frac{x - \mu}{\sigma}\Bigr)^{p}\Bigr\}, \qquad x > \mu,$$

with $p \in (0, 1)$ (a Weibull distribution).
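
For such heavy-tailed distributions the integral appearing in the condition of the Lundberg-Cramér theorem is infinite for every $R > 0$, so that no constant $R$ with the required property exists. A short sketch for the Pareto-type case, assuming that $1 - F(x) \ge C x^{-\alpha}$ for all $x \ge x_0$ (the constants $C > 0$ and $x_0 > 0$ are introduced here only for illustration):

```latex
\int_0^\infty e^{Rx}\,\bigl(1 - F(x)\bigr)\,dx
  \;\ge\; C \int_{x_0}^\infty e^{Rx}\, x^{-\alpha}\,dx = \infty
  \qquad \text{for every } R > 0,
```

because $e^{Rx} x^{-\alpha} \to \infty$ as $x \to \infty$. The same argument applies to the Weibull tail with $p \in (0, 1)$, since then $Rx - ((x - \mu)/\sigma)^p \to \infty$ as $x \to \infty$.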

We note that the Lundberg-Cramér theorem relates to the ‘normal’ case and
cannot be applied to large payments. (One cannot even define the ‘Lundberg
coefficient’ $R$ in the latter case; for the proof of the Lundberg-Cramér theorem see,
e.g., [439].)


1. Necessary Probabilistic Concepts and Several
Models of the Dynamics of Market Prices


§1a. Uncertainty and Irregularity in the Behavior of Prices.
Their Description and Representation in Probabilistic Terms


1. Assume that we measure time in days $n = 0, 1, 2, \ldots$, and let

$$S = (S_n)_{n \ge 0}$$

be the market price of a share, or the exchange rate of two currencies, or another
financial index (of unlimited ‘life span’, by contrast to, say, bond prices). An
empirical study of the $S_n$, $n \ge 0$, shows that they vary in a highly irregular way;
they fluctuate as if (using the words of M. Kendall; see Chapter I, §2a) “ ... the
Demon of Chance drew a random number ... and added it to the current price to
determine the next ... price”.


Beyond all doubt, L. Bachelier was the first to describe the prices $(S_n)_{n \ge 0}$ using
the concepts and the methods of probability theory, which provides a framework
for the study of empirical phenomena characterized by both statistical uncertainty and
stability of statistical frequencies.

Taking the probabilistic approach and using A. N. Kolmogorov’s axiomatics of
probability theory, which is generally accepted now, we shall assume that all our
considerations are carried out with respect to some probability space

$$(\Omega, \mathscr{F}, P),$$

where

$\Omega$ is the space of elementary events $\omega$ (‘market situations’, in the present
context);

$\mathscr{F}$ is some $\sigma$-algebra of subsets of $\Omega$ (the set of ‘observable market events’);

$P$ is a probability (or probability measure) on $\mathscr{F}$.

As pointed out in Chapter I, §1a, time and dynamics are integral parts of
financial theory. For that reason, it seems worthwhile to define our probability
space $(\Omega, \mathscr{F}, P)$ more specifically, by assuming that we have a flow
$\mathbb{F} = (\mathscr{F}_n)_{n \ge 0}$ of $\sigma$-algebras such that

$$\mathscr{F}_0 \subseteq \mathscr{F}_1 \subseteq \cdots \subseteq \mathscr{F}_n \subseteq \cdots \subseteq \mathscr{F}.$$

The point in introducing this flow of nondecreasing $\sigma$-subalgebras of $\mathscr{F}$, which is also
called a filtration, becomes clear once one has accepted the following interpretation:

$\mathscr{F}_n$ is the set of events observable through time $n$.

We can express it otherwise by saying that $\mathscr{F}_n$ is the ‘information’ on the market
situation that is available to an observer up to time $n$ inclusive. (In the framework
of the concept of an ‘efficient’ market this can be, e.g., one of the three $\sigma$-algebras
$\mathscr{F}_n^1$, $\mathscr{F}_n^2$, and $\mathscr{F}_n^3$; see Chapter I, §2a.)

Thus, we assume that our underlying probabilistic model is a filtered probability
space

$$(\Omega, \mathscr{F}, (\mathscr{F}_n)_{n \ge 0}, P),$$

which is also called a stochastic basis.

In many cases it seems reasonable to generalize the concept of a stochastic basis
by assuming that, instead of a single probability measure $P$, we have an entire
family $\mathscr{P} = \{P\}$ of probability measures. (The reason is that it is often difficult
to single out a particular measure $P$.) Using the vocabulary of statistical decision
theory we can call the collection $(\Omega, \mathscr{F}, (\mathscr{F}_n)_{n \ge 0}, \mathscr{P})$ a filtered stochastic
(statistical) experiment.


2. Regarding $\mathscr{F}_n$ as the information that has been accessible to observation through
time $n$, it is natural to assume that

$S_n$ is $\mathscr{F}_n$-measurable,

or (using a more descriptive language) that prices are formed on the basis of the
developments observable on the market up to time $n$ (inclusive).

Bearing in mind our interpretation of $S_n$ as the ‘price’ (of some stock, say) at
time $n$, we shall assume that $S_n > 0$, $n \ge 0$.

We now present the two most common methods for the description of the prices
$S = (S_n)_{n \ge 0}$.

The first method, which is similar to the formula for compound interest (see
Chapter I, §1b) uses the representation

$$S_n = S_0 e^{H_n}, \tag{1}$$

where $H_n = h_0 + h_1 + \cdots + h_n$ with $h_0 = 0$ and the random variables $h_n = h_n(\omega)$,
$n \ge 0$, are $\mathscr{F}_n$-measurable. Hence


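$$H_n = \ln\frac{S_n}{S_0} \tag{2}$$

and the ‘logarithmic returns’ can be evaluated by the formula

$$h_n = \ln\frac{S_n}{S_{n-1}} = \ln\Bigl(1 + \frac{\Delta S_n}{S_{n-1}}\Bigr), \tag{3}$$

where $\Delta S_n = S_n - S_{n-1}$.

We now set

$$\widehat{h}_n = \frac{\Delta S_n}{S_{n-1}} \quad\text{and}\quad \widehat{H}_n = \sum_{1 \le k \le n} \widehat{h}_k. \tag{4}$$

Then we can rewrite (1) as

$$S_n = S_0 \prod_{1 \le k \le n} (1 + \widehat{h}_k), \tag{5}$$

or, equivalently, as

$$S_n = S_0 \prod_{1 \le k \le n} (1 + \Delta\widehat{H}_k) = S_0 e^{\widehat{H}_n} \prod_{1 \le k \le n} (1 + \Delta\widehat{H}_k) e^{-\Delta\widehat{H}_k}. \tag{6}$$

The representation (5) is just the second method of the description of prices. It
is equivalent to the ‘simple interest’ formula.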
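Let $\mathscr{E}(\widehat{H})_n$ be the expression multiplying $S_0$ on the right-hand side of (6):

$$\mathscr{E}(\widehat{H})_n = e^{\widehat{H}_n} \prod_{1 \le k \le n} (1 + \Delta\widehat{H}_k) e^{-\Delta\widehat{H}_k}. \tag{7}$$

We call the stochastic sequence

$$\mathscr{E}(\widehat{H}) = (\mathscr{E}(\widehat{H})_n)_{n \ge 0}, \qquad \mathscr{E}(\widehat{H})_0 = 1,$$

defined by this expression the stochastic exponential generated by the variables
$\widehat{H} = (\widehat{H}_n)_{n \ge 0}$, $\widehat{H}_0 = 0$, or the Doléans exponential.

Thus, we can say that the first method for the description of prices uses the
usual exponential

$$S_n = S_0 e^{H_n},$$

while the second method involves the stochastic exponential:

$$S_n = S_0 \mathscr{E}(\widehat{H})_n, \tag{8}$$

where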


$$\widehat{H}_n = \sum_{1 \le k \le n} (e^{\Delta H_k} - 1),$$

which is equivalent to the following representation:

$$\widehat{H}_n = H_n + \sum_{1 \le k \le n} (e^{\Delta H_k} - \Delta H_k - 1). \tag{9}$$

It is also clear from (3) and (4) that

$$H_n = \sum_{1 \le k \le n} \ln(1 + \Delta\widehat{H}_k), \tag{10}$$

where $\Delta\widehat{H}_n = \widehat{h}_n > -1$ by our assumption that $S_n > 0$.

It is worth noting that the stochastic exponential satisfies the stochastic difference
equation

$$\Delta\mathscr{E}(\widehat{H})_n = \mathscr{E}(\widehat{H})_{n-1} \Delta\widehat{H}_n, \tag{11}$$

which is an immediate consequence of (7).
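
The equivalence of the two descriptions of prices can be checked numerically. The sketch below is only an illustration: the function names and the sample log returns are hypothetical. It builds the prices once from formula (1) and once from the stochastic exponential computed through the difference equation (11), with simple returns obtained from logarithmic ones as exp(h) - 1.

```python
import math


def prices_from_log_returns(s0, h):
    """First method, formula (1): S_n = S_0 * exp(h_1 + ... + h_n)."""
    prices, cumulative = [s0], 0.0
    for hn in h:
        cumulative += hn
        prices.append(s0 * math.exp(cumulative))
    return prices


def stochastic_exponential(h_hat):
    """Discrete stochastic exponential via the difference equation (11):
    E_n = E_{n-1} * (1 + h_hat_n), starting from E_0 = 1."""
    values = [1.0]
    for x in h_hat:
        values.append(values[-1] * (1.0 + x))
    return values


h = [0.01, -0.02, 0.015, 0.003]            # hypothetical daily log returns
h_hat = [math.exp(x) - 1.0 for x in h]     # corresponding simple returns
s0 = 100.0

first = prices_from_log_returns(s0, h)                       # formula (1)
second = [s0 * e for e in stochastic_exponential(h_hat)]     # formula (8)

for a, b in zip(first, second):
    print(round(a, 6), round(b, 6))
```

Both columns coincide up to rounding errors, as (1) and (8) require.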


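Remark 1. Formulas (1) and (8) are related to discrete time $n = 0, 1, \ldots$. In
the case when the prices $S = (S_t)_{t \ge 0}$ evolve in continuous time $t \ge 0$ one usually
assumes that the processes $H = (H_t)_{t \ge 0}$ and $\widehat{H} = (\widehat{H}_t)_{t \ge 0}$ are semimartingales
(see Chapter III, §5b). Then by the Itô formula (Chapter III, §5c, see also
[250; Chapter I, §4e]) we obtain

$$e^{H_t} = \mathscr{E}(\widehat{H})_t,$$

where

$$\widehat{H}_t = H_t + \frac{1}{2}\langle H^c \rangle_t + \sum_{0 < s \le t} (e^{\Delta H_s} - 1 - \Delta H_s) \tag{12}$$

and $\mathscr{E}(\widehat{H}) = (\mathscr{E}(\widehat{H})_t)_{t \ge 0}$ is the stochastic exponential:

$$\mathscr{E}(\widehat{H})_t = e^{\widehat{H}_t - \frac{1}{2}\langle \widehat{H}^c \rangle_t} \prod_{0 < s \le t} (1 + \Delta\widehat{H}_s) e^{-\Delta\widehat{H}_s}, \tag{13}$$

satisfying (cf. (11)) the linear stochastic differential equation

$$d\mathscr{E}(\widehat{H})_t = \mathscr{E}(\widehat{H})_{t-}\, d\widehat{H}_t. \tag{14}$$

(In (12) and (13) we denote by $\langle H^c \rangle$ and $\langle \widehat{H}^c \rangle$ the quadratic characteristics of the
continuous martingale components of the semimartingales $H$ and $\widehat{H}$; see Chapter III,
§5a. As regards stochastic differential equations in the case when $\widehat{H}$ is a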
Brownian motion, see Chapter III, §3e.)

Thus, we have the following representations that are the counterparts of (1)
and (8), respectively, in the continuous-time case:

$$S_t = S_0 e^{H_t} \tag{15}$$

and

$$S_t = S_0 \mathscr{E}(\widehat{H})_t, \tag{16}$$

where the process $\widehat{H} = (\widehat{H}_t)_{t \ge 0}$ is related to $H = (H_t)_{t \ge 0}$ by (12). The series
in (12) is absolutely convergent because, with probability one, a semimartingale
makes only finitely many ‘large’ jumps ($|\Delta H_s| > \frac{1}{2}$) on each interval $[0, t]$ and
$\sum_{0 < s \le t} |\Delta H_s|^2 < \infty$. (See Remark 3 in Chapter III, §5b.)


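Remark 2. By (3) and (4),

$$h_n = \ln(1 + \widehat{h}_n) \tag{17}$$

and

$$\widehat{h}_n = e^{h_n} - 1. \tag{18}$$

Clearly, we have

$$\widehat{h}_n \approx h_n \tag{19}$$

for small values of $h_n$; moreover,

$$\widehat{h}_n - h_n = \frac{1}{2} h_n^2 + \frac{1}{6} h_n^3 + \cdots. \tag{20}$$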

3. We now dwell on the problem of the description of the probability distributions
for the sequences $S = (S_n)_{n \ge 0}$ and $H = (H_n)_{n \ge 0}$.

Taking the viewpoint of classical probability theory and the well-developed
‘statistics of normal distribution’ it would be nice if $H = (H_n)_{n \ge 0}$ could be a Gaussian
(normally distributed) sequence. If we set

$$H_n = h_1 + \cdots + h_n, \qquad n \ge 1, \tag{21}$$

then the properties of such a sequence are completely determined by the
two-dimensional distributions of the sequence $h = (h_n)_{n \ge 1}$, which can be characterized
by the expectations

$$\mu_n = E h_n, \qquad n \ge 1,$$

and the covariances

$$\operatorname{Cov}(h_n, h_m) = E h_n h_m - E h_n\, E h_m, \qquad m, n \ge 1.$$
This assumption of normality can considerably facilitate the solution of many
problems relating to the properties of distributions. For instance, the ‘Theorem on
normal correlation’ (see, for example, [303; Chapter 13]) delivers an explicit formula
for the conditional expectation $\widetilde{h}_{n+1} = E(h_{n+1} \mid h_1, \ldots, h_n)$, which is the optimal
(in the mean-square sense) estimator for $h_{n+1}$ in terms of $h_1, \ldots, h_n$. Namely,

$$\widetilde{h}_{n+1} = \mu_{n+1} + \sum_{i=1}^{n} a_i (h_i - \mu_i), \tag{22}$$

where the coefficients $a_i$ are evaluated in terms of the covariance matrix (see [303;
Chapter 13] and also [439; Chapter II, §13]).


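Formula (22) becomes particularly simple if $h_1, \dots, h_n$ are independent. In this case
$$a_i = \frac{\operatorname{Cov}(h_{n+1}, h_i)}{\mathsf D h_i},$$
and we obtain the estimator
$$\hat h_{n+1} = \mathsf E h_{n+1} + \sum_{i=1}^{n} \frac{\operatorname{Cov}(h_{n+1}, h_i)}{\mathsf D h_i}\,(h_i - \mathsf E h_i). \tag{23}$$
The estimation error
$$\Delta_{n+1} = \mathsf E(h_{n+1} - \hat h_{n+1})^2$$
can be expressed by the formula
$$\Delta_{n+1} = \mathsf D h_{n+1} - \sum_{i=1}^{n} \frac{\operatorname{Cov}^2(h_{n+1}, h_i)}{\mathsf D h_i}. \tag{24}$$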
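Formulas (23) and (24) translate directly into a few lines of code. The following is a minimal sketch for the case of independent $h_1, \dots, h_n$; the function name `predict_next` and the numerical inputs are hypothetical, chosen only to make the arithmetic visible.

```python
def predict_next(mean_next, var_next, means, variances, covs, observed):
    """Estimator (23) and mean-square error (24) for independent h_1..h_n.

    covs[i] is Cov(h_{n+1}, h_i); variances[i] is D h_i.
    """
    # Coefficients a_i = Cov(h_{n+1}, h_i) / D h_i
    coeffs = [c / v for c, v in zip(covs, variances)]
    estimate = mean_next + sum(
        a * (h - m) for a, h, m in zip(coeffs, observed, means)
    )
    error = var_next - sum(c * c / v for c, v in zip(covs, variances))
    return estimate, error

# Hypothetical inputs: two past observations with variances 1 and 4
est, err = predict_next(0.0, 1.0, [0.0, 0.0], [1.0, 4.0], [0.5, 1.0], [0.2, 0.4])
print(est, err)
```

With these inputs the coefficients are $a_1 = 0.5$ and $a_2 = 0.25$, so the estimate is $0.5 \cdot 0.2 + 0.25 \cdot 0.4 = 0.2$ and the error is $1 - 0.25 - 0.25 = 0.5$.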
We note that if
$$\varphi_{(\mu,\sigma^2)}(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$
is the density of the normal distribution with parameters $(\mu, \sigma^2)$, then (see Fig. 11)
$$\int_{\mu-\sigma}^{\mu+\sigma} \varphi_{(\mu,\sigma^2)}(x)\,dx = 0.6827\ldots.$$

FIGURE 11. Graph of the density $\varphi_{(0,1)}(x)$ of the standard normal distribution. The area of the shaded part is approximately 0.6827


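In the same way,
$$\int_{\mu-1.65\sigma}^{\mu+1.65\sigma} \varphi_{(\mu,\sigma^2)}(x)\,dx \approx 0.90. \tag{25}$$
By the Gaussian property we obtain
$$h_{n+1} - \hat h_{n+1} \sim \mathcal N(0, \Delta_{n+1}),$$
therefore
$$\mathsf P\bigl\{|h_{n+1} - \hat h_{n+1}| \le 1.65\sqrt{\Delta_{n+1}}\bigr\} \approx 0.90$$
by (25). Hence we can say that the expected value of $h_{n+1}$ lies in the confidence interval
$$\bigl[\hat h_{n+1} - 1.65\sqrt{\Delta_{n+1}},\ \hat h_{n+1} + 1.65\sqrt{\Delta_{n+1}}\bigr]$$
with probability close to 0.90. This means that, in 90% of cases, the predicted value $\hat S_{n+1}$ of the market price $S_{n+1}$ (calculated from the observations $h_1, \dots, h_n$) lies in the interval
$$\bigl[S_n e^{\hat h_{n+1} - 1.65\sqrt{\Delta_{n+1}}},\ S_n e^{\hat h_{n+1} + 1.65\sqrt{\Delta_{n+1}}}\bigr].$$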
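The two probabilities quoted above and the resulting price interval can be reproduced with the error function from the standard library. This is a sketch under the Gaussian assumption of this subsection; the function names and the sample inputs are hypothetical.

```python
import math

def normal_interval_prob(k):
    """P(|Z| <= k) for a standard normal variable Z."""
    return math.erf(k / math.sqrt(2.0))

def price_interval(s_n, h_hat, delta, k=1.65):
    """Interval [S_n e^{h_hat - k sqrt(delta)}, S_n e^{h_hat + k sqrt(delta)}]."""
    half_width = k * math.sqrt(delta)
    return s_n * math.exp(h_hat - half_width), s_n * math.exp(h_hat + half_width)

print(normal_interval_prob(1.0))    # about 0.6827
print(normal_interval_prob(1.65))   # about 0.90
print(price_interval(100.0, 0.0, 0.0004))
```

Note that the price interval is symmetric about $S_n e^{\hat h_{n+1}}$ only on the logarithmic scale.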


4. However attractive, this conjecture on the ‘normality’ of the distribution of the variables $h_n$, $n \ge 1$, must be taken with caution, because the empirical analysis of much financial data shows (see Chapter IV) that

(a) the number of values in a sample that lie outside the ‘confidence’ intervals $[\bar h_n - k\hat\sigma_n,\ \bar h_n + k\hat\sigma_n]$ with $k = 1, 2, 3$ (here $\bar h_n = \frac1n \sum_{i=1}^{n} h_i$ is the sample mean and $\hat\sigma_n$ is the standard deviation:
$$\hat\sigma_n^2 = \frac{1}{n-1} \sum_{i=1}^{n} (h_i - \bar h_n)^2\,)$$
is considerably larger than it should be if this conjecture were true. More geometrically, this means that the ‘tails’ of the empirical densities are ‘heavy’, i.e., they decrease at a much lower rate than they would for Gaussian distributions;

(b) the kurtosis
$$\hat K_n = \frac{\hat m_4}{\hat m_2^{\,2}} - 3$$
(here $\hat m_2$ and $\hat m_4$ are the empirical second and fourth moments) is markedly positive (the kurtosis of a normal distribution is equal to zero), which shows that the distribution density has a high peak around the mean value (is leptokurtic):


FIGURE 12. Empirical density of the one-dimensional distribution of the variables $h_n$, $n \le 300$, governed by the HARCH(16) model (see § 3b). The continuous curve is the density of the corresponding normal distribution $\mathcal N(m, \sigma^2)$ with $m = \bar h_{300}$ and $\sigma = \hat\sigma_{300}$. (Horizontal axis: values from about -0.006 to 0.004; vertical axis: density, with ticks at 10, 20, 30.)
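The two diagnostics (a) and (b) are easy to compute for any sample. The sketch below assumes that the ‘empirical moments’ in (b) are central moments; the function name `tail_and_kurtosis` and the toy sample are hypothetical.

```python
import math

def tail_and_kurtosis(h):
    """Fractions of the sample outside mean +/- k*sigma (k = 1, 2, 3),
    the corresponding normal probabilities, and the excess kurtosis."""
    n = len(h)
    mean = sum(h) / n
    m2 = sum((x - mean) ** 2 for x in h) / n      # empirical 2nd central moment
    m4 = sum((x - mean) ** 4 for x in h) / n      # empirical 4th central moment
    sigma = math.sqrt(m2 * n / (n - 1))           # standard deviation with 1/(n-1)
    outside = {k: sum(abs(x - mean) > k * sigma for x in h) / n for k in (1, 2, 3)}
    normal = {k: 1.0 - math.erf(k / math.sqrt(2.0)) for k in (1, 2, 3)}
    kurtosis = m4 / (m2 * m2) - 3.0
    return outside, normal, kurtosis

# Toy sample: mostly zeros with two large values of opposite sign
sample = [0.0] * 18 + [5.0, -5.0]
print(tail_and_kurtosis(sample))
```

For this toy sample one tenth of the values fall outside the three-sigma interval, against a normal probability of about 0.0027, and the kurtosis is clearly positive.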


Arguably, the strongest assumption about the general properties of the distribution of the $h_n$, apart from the Gaussian property, is the following one:

these variables are independent and identically distributed.

Under this assumption it is easy to carry out the analysis of the prices $S_n = S_0 e^{H_n}$, where $H_n = h_1 + \cdots + h_n$, by the standard methods of probability theory designed expressly for such situations. It is clear, however, that the assumption of the independence of the $h_n$ undermines all hopes that ‘past’ information could be of any use in the prediction of ‘future values’.

In practice, the situation is more favorable, since numerous studies of financial time series enable one to say that, as mentioned already, the variables $(h_n)$ are non-Gaussian and dependent, although they can be uncorrelated and their dependence can be rather weak. The easiest way to demonstrate a certain degree of dependence is to consider the empirical correlations for the $|h_n|$ or $h_n^2$, rather than for the $h_n$. (In the model of stochastic volatility discussed below we have $\operatorname{Cov}(h_n, h_m) = 0$ for $n \ne m$, but the values of $\operatorname{Cov}(h_n^2, h_m^2)$ and $\operatorname{Cov}(|h_n|, |h_m|)$ are far from zero; see § 3c.)
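The phenomenon ‘uncorrelated but dependent’ can be seen in a small simulation. The sketch below uses a simple ARCH(1)-type recursion as a stand-in; it is not the model of § 3c, and the parameters `a0`, `a1`, the seed, and the function names are all hypothetical choices.

```python
import math
import random

def simulate_arch1(n, a0=0.7, a1=0.3, seed=0):
    """h_k = sigma_k * eps_k with sigma_k^2 = a0 + a1 * h_{k-1}^2, eps_k ~ N(0, 1)."""
    rng = random.Random(seed)
    h, prev = [], 0.0
    for _ in range(n):
        sigma = math.sqrt(a0 + a1 * prev * prev)
        prev = sigma * rng.gauss(0.0, 1.0)
        h.append(prev)
    return h

def lag1_corr(x):
    """Empirical lag-one autocorrelation of the sequence x."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x)
    cov = sum((x[i] - mean) * (x[i - 1] - mean) for i in range(1, n))
    return cov / var

h = simulate_arch1(20000)
print("corr of h_n:   ", lag1_corr(h))                    # close to zero
print("corr of h_n^2: ", lag1_corr([v * v for v in h]))   # clearly positive
```

The correlation of the $h_n$ themselves is negligible, while the squares show a distinct positive correlation, which is the kind of dependence the text refers to.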
§1b. Doob Decomposition. Canonical Representations


1. We shall assume that the variables $h_n$, $n \ge 1$, in the model
$$S_n = S_0 e^{H_n}, \qquad H_n = h_1 + \cdots + h_n, \tag{1}$$
have finite absolute first moments, i.e., $\mathsf E|h_n| < \infty$ for $n \ge 1$.

The Doob decomposition that we discuss later in this section indicates that one should study the sequence $H = (H_n)$ in its dependence on the properties of the filtration $(\mathcal F_n)$, i.e., of the flow of ‘information’ $\mathcal F_n$ accessible to an ‘observer’ (of the securities market, in the context adopted here).

Since $\mathsf E|h_n| < \infty$ for $n \ge 1$, the conditional expectations $\mathsf E(h_n \mid \mathcal F_{n-1})$ are well defined and ($H_0 = 0$, $\mathcal F_0 = \{\varnothing, \Omega\}$)
$$H_n = \sum_{k \le n} \mathsf E(h_k \mid \mathcal F_{k-1}) + \sum_{k \le n} \bigl[h_k - \mathsf E(h_k \mid \mathcal F_{k-1})\bigr]. \tag{2}$$

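In other words, setting
$$A_n = \sum_{k \le n} \mathsf E(h_k \mid \mathcal F_{k-1}) \tag{3}$$
and
$$M_n = \sum_{k \le n} \bigl[h_k - \mathsf E(h_k \mid \mathcal F_{k-1})\bigr] \tag{4}$$
we obtain the following Doob decomposition for $H = (H_n)$:
$$H_n = A_n + M_n, \qquad n \ge 1, \tag{5}$$
where
a) the sequence $A = (A_n)_{n\ge0}$, where $A_0 = 0$, is predictable, i.e., the $A_n$ are $\mathcal F_{n-1}$-measurable, $n \ge 1$;
b) the sequence $M = (M_n)_{n\ge0}$, where $M_0 = 0$, is a martingale, i.e., the $M_n$ are $\mathcal F_n$-measurable, $\mathsf E|M_n| < \infty$ for $n \ge 1$, and
$$\mathsf E(M_n \mid \mathcal F_{n-1}) = M_{n-1} \quad (\mathsf P\text{-a.s.}), \qquad n \ge 1.$$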
Remark. Assume that, besides the filtration $(\mathcal F_n)$, we have a subfiltration $(\mathcal G_n)$, where $\mathcal G_n \subseteq \mathcal F_n$ and $\mathcal G_n \subseteq \mathcal G_{n+1}$. Then we can decompose $H = (H_n)$ with respect to the flow $(\mathcal G_n)$ in a similar way to (5):
$$H_n = \sum_{k=1}^{n} \mathsf E(h_k \mid \mathcal G_{k-1}) + \sum_{k=1}^{n} \bigl(h_k - \mathsf E(h_k \mid \mathcal G_{k-1})\bigr).$$
The sequence $A = (A_n)$ of the variables
$$A_n = \sum_{k=1}^{n} \mathsf E(h_k \mid \mathcal G_{k-1})$$
is $(\mathcal G_n)$-predictable (i.e., the $A_n$ are $\mathcal G_{n-1}$-measurable), but $M = (M_n)$, where
$$M_n = \sum_{k=1}^{n} \bigl(h_k - \mathsf E(h_k \mid \mathcal G_{k-1})\bigr),$$
is, generally speaking, not a martingale with respect to $(\mathcal G_n)$ since the $h_k$ are measurable with respect to the $\mathcal F_k$, but not necessarily with respect to the $\mathcal G_k$.
It must be pointed out that if, besides (5), we have another decomposition,
$$H_n = A'_n + M'_n,$$
where the sequence $A' = (A'_n, \mathcal F_{n-1})$ is predictable (with respect to the flow $(\mathcal F_n)$), $A'_0 = 0$, and $M' = (M'_n, \mathcal F_n)$ is a martingale, then $A'_n = A_n$ and $M'_n = M_n$ for all $n \ge 0$.

For we have
$$A'_{n+1} - A'_n = (A_{n+1} - A_n) + (M_{n+1} - M_n) - (M'_{n+1} - M'_n).$$
Hence, considering the conditional expectations $\mathsf E(\,\cdot \mid \mathcal F_n)$ of both sides, we see that $A'_{n+1} - A'_n = A_{n+1} - A_n$ (since the $A'_{n+1}$ and $A_{n+1}$ are $\mathcal F_n$-measurable). However, $A'_0 = A_0 = 0$, therefore $A'_n = A_n$ and $M'_n = M_n$ for all $n \ge 0$. Hence the decomposition (5) with predictable sequence $A = (A_n)$ is unique.

We note also that if $\mathsf E(h_k \mid \mathcal F_{k-1}) = 0$ for $k \ge 1$ in the model under consideration, then the sequence $H = (H_n)$ is itself a martingale by (2).

We now present an example of the Doob decomposition, which shows clearly that, for all its simplicity, it is ‘nontrivial’.


EXAMPLE. Let $X_0 = 0$ and $X_n = \xi_1 + \cdots + \xi_n$, $n \ge 1$, where the $\xi_n$ are independent Bernoulli variables such that
$$\mathsf P(\xi_n = \pm 1) = \tfrac12.$$
We consider the Doob decomposition for $H_0 = 0$ and $H_n = |X_n|$, $n \ge 1$. In this case we have
$$h_n = \Delta H_n = \Delta|X_n| = |X_n| - |X_{n-1}| = |X_{n-1} + \xi_n| - |X_{n-1}|,$$
and it is clear that
$$\Delta M_n = h_n - \mathsf E(h_n \mid \mathcal F_{n-1}) = |X_{n-1} + \xi_n| - \mathsf E(|X_{n-1} + \xi_n| \mid \mathcal F_{n-1}) = |X_{n-1} + \xi_n| - \mathsf E(|X_{n-1} + \xi_n| \mid X_{n-1}) = (\operatorname{Sgn} X_{n-1})\,\xi_n,$$
where
$$\operatorname{Sgn} x = \begin{cases} 1, & x > 0, \\ 0, & x = 0, \\ -1, & x < 0. \end{cases}$$
Thus, the martingale $M = (M_n)_{n\ge1}$ in (5) is now as follows:
$$M_n = \sum_{1 \le k \le n} (\operatorname{Sgn} X_{k-1})\,\Delta X_k.$$
Further,
$$\mathsf E(h_n \mid \mathcal F_{n-1}) = \mathsf E(|X_{n-1} + \xi_n| \mid X_{n-1}) - |X_{n-1}|.$$
The right-hand side vanishes on the set $\{\omega\colon X_{n-1} = i\}$ with $i \ne 0$, while if $i = 0$, then it is equal to one. Hence
$$\sum_{k=1}^{n} \mathsf E(h_k \mid \mathcal F_{k-1}) = N\{1 \le k \le n\colon X_{k-1} = 0\},$$
where $N\{1 \le k \le n\colon X_{k-1} = 0\}$ is the number of $k$, $1 \le k \le n$, such that $X_{k-1} = 0$.

Let $L_n(0) = N\{0 \le k \le n-1\colon X_k = 0\}$ be the number of zeros in the sequence $(X_k)_{0 \le k \le n-1}$. Then
$$|X_n| = \sum_{1 \le k \le n} (\operatorname{Sgn} X_{k-1})\,\Delta X_k + L_n(0),$$
which is a discrete analogue of the well-known Tanaka formula for the absolute value of a Brownian motion (see Chapter III, § 5c). Hence, in particular,
$$\mathsf E L_n(0) = \mathsf E|X_n|.$$
Since $X_n/\sqrt n$ is asymptotically $\mathcal N(0,1)$-distributed, it follows that $\mathsf E|X_n| \sim \sqrt{\frac{2}{\pi} n}$, and therefore
$$\mathsf E L_n(0) \sim \sqrt{\frac{2}{\pi}\, n}.$$
This is a known result on the average number of zeros in a symmetric random Bernoulli walk (see, e.g., [156]).
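The discrete Tanaka formula of this example can be verified path by path for a short walk. The sketch below enumerates all $2^n$ paths for a small, arbitrarily chosen $n$; the helper names are hypothetical.

```python
from itertools import product
from math import pi, sqrt

def sgn(x):
    return (x > 0) - (x < 0)

def decompose(steps):
    """Return (|X_n|, M_n, L_n(0)) for one path of +1/-1 steps."""
    x, m, zeros = 0, 0, 0
    for xi in steps:
        zeros += (x == 0)      # counts the indices k-1 with X_{k-1} = 0
        m += sgn(x) * xi       # increment (Sgn X_{k-1}) * Delta X_k of the martingale
        x += xi
    return abs(x), m, zeros

n = 12
rows = [decompose(p) for p in product((-1, 1), repeat=n)]

# Pathwise identity |X_n| = M_n + L_n(0)
assert all(a == m + zeros for a, m, zeros in rows)

mean_abs = sum(a for a, _, _ in rows) / len(rows)
mean_zeros = sum(z for _, _, z in rows) / len(rows)
mean_m = sum(m for _, m, _ in rows) / len(rows)
print(mean_abs, mean_zeros, mean_m, sqrt(2 * n / pi))
```

The averages of $|X_n|$ and $L_n(0)$ coincide exactly, the martingale part averages to zero, and both averages are already close to $\sqrt{2n/\pi}$ for $n = 12$.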




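2. Let $M = (M_n)_{n\ge1}$ be a square integrable martingale ($\mathsf E M_n^2 < \infty$, $n \ge 1$) with $M_0 = 0$. Then setting $H_n = M_n^2$ in the decomposition (2) we obtain the following representation:
$$M_n^2 = \sum_{1 \le k \le n} \mathsf E(\Delta M_k^2 \mid \mathcal F_{k-1}) + \sum_{1 \le k \le n} \bigl(\Delta M_k^2 - \mathsf E(\Delta M_k^2 \mid \mathcal F_{k-1})\bigr), \tag{6}$$
where $\Delta M_k^2 = M_k^2 - M_{k-1}^2$.

We now set
$$\langle M\rangle_n = \sum_{1 \le k \le n} \mathsf E(\Delta M_k^2 \mid \mathcal F_{k-1}), \qquad m_n = \sum_{1 \le k \le n} \bigl(\Delta M_k^2 - \mathsf E(\Delta M_k^2 \mid \mathcal F_{k-1})\bigr). \tag{7}$$
Using this notation we can rewrite (6) as follows:
$$M_n^2 = \langle M\rangle_n + m_n; \tag{8}$$
the (predictable) sequence $\langle M\rangle = (\langle M\rangle_n)_{n\ge1}$ is called here the quadratic characteristic of the martingale $M$ (cf. Chapter III, § 5b).

We note that since $M = (M_n)$ is a martingale, it follows that
$$\mathsf E(\Delta M_k^2 \mid \mathcal F_{k-1}) = \mathsf E\bigl((\Delta M_k)^2 \mid \mathcal F_{k-1}\bigr). \tag{9}$$
This property explains why one often calls the quadratic characteristic $\langle M\rangle$ also the predictable quadratic variation of the (square integrable) martingale $M$. In that case one reserves the term quadratic variation for the (in general, unpredictable) sequence $[M] = ([M]_n)_{n\ge1}$ of the variables (cf. Chapter III, § 5b)
$$[M]_n = \sum_{k \le n} (\Delta M_k)^2. \tag{10}$$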
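Identity (9) rests on a short computation that is left implicit. The following derivation is a sketch that uses only the martingale property $\mathsf E(\Delta M_k \mid \mathcal F_{k-1}) = 0$ and the $\mathcal F_{k-1}$-measurability of $M_{k-1}$:

```latex
\begin{align*}
\Delta M_k^2 &= M_k^2 - M_{k-1}^2 = (\Delta M_k)^2 + 2 M_{k-1}\,\Delta M_k, \\
\mathsf{E}(\Delta M_k^2 \mid \mathcal{F}_{k-1})
  &= \mathsf{E}\bigl((\Delta M_k)^2 \mid \mathcal{F}_{k-1}\bigr)
     + 2 M_{k-1}\,\mathsf{E}(\Delta M_k \mid \mathcal{F}_{k-1})
   = \mathsf{E}\bigl((\Delta M_k)^2 \mid \mathcal{F}_{k-1}\bigr).
\end{align*}
```

In particular, $\langle M\rangle_n$ and $[M]_n$ have the same expectation, namely $\mathsf E M_n^2$.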


3. We now assume that the sequence $H = (H_n)$ is itself a martingale, and even a square integrable one, i.e., $\mathsf E(\Delta H_k \mid \mathcal F_{k-1}) = \mathsf E(h_k \mid \mathcal F_{k-1}) = 0$, $\mathsf E h_k^2 < \infty$, $k \ge 1$. Then
$$\langle H\rangle_n = \sum_{k \le n} \mathsf E(h_k^2 \mid \mathcal F_{k-1}). \tag{11}$$
The variables $\mathsf E(h_k^2 \mid \mathcal F_{k-1})$ contributing to the quadratic characteristic $\langle H\rangle_n$ define the volatility of $H$ and, in large measure, its properties. For example, if $\langle H\rangle_n \to \infty$ with probability one, then the square integrable martingale $H$ satisfies the strong law of large numbers:
$$\frac{H_n}{\langle H\rangle_n} \to 0 \quad (\mathsf P\text{-a.s.}) \qquad \text{as } n \to \infty. \tag{12}$$
(See [439; Chapter VII, § 5].)

The collection of variables $\bigl(\mathsf E(h_k^2 \mid \mathcal F_{k-1})\bigr)_{k\ge1}$ will play an important role in our subsequent analysis of ‘financial’ time series $S = (S_n)$ with $S_n = S_0 e^{H_n}$. In our considerations of these series, using the vocabulary of financial theory, we shall call the sequence $\bigl(\mathsf E(h_k^2 \mid \mathcal F_{k-1})\bigr)_{k\ge1}$ the stochastic volatility. (See § 3 for a closer look at volatility.)

If the conditional expectations $\mathsf E(h_k^2 \mid \mathcal F_{k-1})$ coincide with the unconditional ones (e.g., if $(h_n)$ is a sequence of independent random variables and $\mathcal F_{k-1} = \sigma(h_1, \dots, h_{k-1})$ is the σ-algebra generated by $h_1, \dots, h_{k-1}$), then the volatility is a mere collection of variances $\sigma_k^2 = \mathsf E h_k^2$, $k \ge 1$, that are the standard measures of the dispersion (changeability) of the $h_k$ (here we assume that $\mathsf E h_k = 0$).


4. In deducing the Doob decomposition (2) or (5) we assumed that $\mathsf E|h_k| < \infty$ for $k \ge 1$. Actually, we required this assumption only to ensure that the conditional expectations $\mathsf E(h_k \mid \mathcal F_{k-1})$, $k \ge 1$, were well defined. Thus, there arises the natural idea of using the decomposition (2) also in the (more general) case when the conditional expectations $\mathsf E(h_k \mid \mathcal F_{k-1})$ are well defined and finite and the condition $\mathsf E|h_k| < \infty$ is not necessarily satisfied.

To this end we recall that if $\mathsf E|h_k| < \infty$, then the conditional expectation $\mathsf E(h_k \mid \mathcal F_{k-1})$ is (according to A. N. Kolmogorov) the $\mathcal F_{k-1}$-measurable random variable such that
$$\int_A \mathsf E(h_k \mid \mathcal F_{k-1})\,d\mathsf P = \int_A h_k\,d\mathsf P \tag{13}$$
for each $A \in \mathcal F_{k-1}$. The existence of such a variable is a consequence of the Radon-Nikodym theorem (see, e.g., [439; Chapter II, § 7]).

Note, however, that the condition $\mathsf E|h_k| < \infty$ is not at all necessary for the existence of an $\mathcal F_{k-1}$-measurable variable $\mathsf E(h_k \mid \mathcal F_{k-1})$ satisfying (13). For instance, if $h_k \ge 0$ ($\mathsf P$-a.s.), then we can define it without assuming that $\mathsf E h_k < \infty$. Hence the idea of the following definition of a generalized expectation, which we shall also denote by $\mathsf E(h_k \mid \mathcal F_{k-1})$.

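We represent $h_k$ as the following sum:
$$h_k = h_k^+ - h_k^-,$$
where $h_k^+ = \max(h_k, 0)$ and $h_k^- = -\min(h_k, 0)$. We assume that we have already defined $\mathsf E(h_k^+ \mid \mathcal F_{k-1})(\omega)$ and $\mathsf E(h_k^- \mid \mathcal F_{k-1})(\omega)$ so that
$$\min\bigl\{\mathsf E(h_k^+ \mid \mathcal F_{k-1})(\omega),\ \mathsf E(h_k^- \mid \mathcal F_{k-1})(\omega)\bigr\} < \infty \tag{14}$$
for all $\omega \in \Omega$. Then we set
$$\mathsf E(h_k \mid \mathcal F_{k-1})(\omega) \equiv \mathsf E(h_k^+ \mid \mathcal F_{k-1})(\omega) - \mathsf E(h_k^- \mid \mathcal F_{k-1})(\omega) \tag{15}$$
(here and throughout “$\equiv$” means ‘equal by definition’), and we call $\mathsf E(h_k \mid \mathcal F_{k-1})$ the generalized conditional expectation.

If $\mathsf E|h_k| < \infty$, then this generalized expectation is clearly the same as the usual conditional expectation.

If $\mathsf E(|h_k| \mid \mathcal F_{k-1})(\omega) < \infty$ for $\omega \in \Omega$, then (14) obviously holds and $\mathsf E(h_k \mid \mathcal F_{k-1})(\omega)$ is not merely well defined, but also finite for each $\omega \in \Omega$. In this case we say that the generalized conditional expectation $\mathsf E(h_k \mid \mathcal F_{k-1})$ is well defined and finite.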


Remark. Proceeding in accordance with the general spirit of probability theory, which usually puts weight on the verification of particular properties ‘for almost all’ $\omega \in \Omega$, rather than ‘for all’ of them, we can easily construct an ‘almost-all’ version of the above definition of the generalized conditional expectation by setting $\mathsf E(h_k \mid \mathcal F_{k-1})(\omega)$ to be arbitrary on the zero-probability set where (14) fails.

We now consider the representation (2). The right-hand side of (2) is surely well defined if $\mathsf E(|h_k| \mid \mathcal F_{k-1}) < \infty$ for $k \ge 1$ (and for all or almost all $\omega \in \Omega$). In that case we shall say that (2) is a generalized Doob decomposition of the sequence $H = (H_n)_{n\ge1}$.


5. We now discuss a similar decomposition (or, as we shall also call it, a representation) in the case when the conditional expectations $\mathsf E(h_k \mid \mathcal F_{k-1})$ (either ‘usual’ or generalized) are not defined. Then we can proceed as follows.

We represent $h_k$ as a sum:
$$h_k = h_k I(|h_k| \le a) + h_k I(|h_k| > a),$$
where $a$ is a certain positive constant (one usually sets $a = 1$) and $I(A)$ (we shall also write $I_A$ or $I_A(\omega)$) is the indicator of the set $A$ (i.e., $I_A(\omega) = 1$ if $\omega \in A$ and $I_A(\omega) = 0$ otherwise).

Now, the variables $h_k I(|h_k| \le a)$ have well-defined first absolute moments, therefore
$$H_n^{(\le a)} \equiv \sum_{1 \le k \le n} h_k I(|h_k| \le a) = \sum_{1 \le k \le n} \mathsf E\bigl(h_k I(|h_k| \le a) \mid \mathcal F_{k-1}\bigr) + \sum_{1 \le k \le n} \bigl[h_k I(|h_k| \le a) - \mathsf E\bigl(h_k I(|h_k| \le a) \mid \mathcal F_{k-1}\bigr)\bigr] \quad \bigl(= A_n^{(\le a)} + M_n^{(\le a)}\bigr). \tag{16}$$
Hence
$$H_n = A_n^{(\le a)} + M_n^{(\le a)} + \sum_{1 \le k \le n} h_k I(|h_k| > a), \tag{17}$$
where $(A_n^{(\le a)})_{n\ge1}$ is a predictable sequence, $(M_n^{(\le a)})_{n\ge1}$ is a martingale, and $\bigl(\sum_{1 \le k \le n} h_k I(|h_k| > a)\bigr)_{n\ge1}$ is the sequence of ‘large’ jumps.
Using the vocabulary of the ‘general theory of stochastic processes’ (see Chapter III, § 5 and [250; Chapter I, § 4c]) we call (17) the canonical representation of $H$.

We note that if, besides (17), we have another representation of $H$ in the form
$$H_n = A'_n + M'_n + \sum_{1 \le k \le n} h_k I(|h_k| > a) \tag{18}$$
with predictable sequence $(A'_n)$ and martingale $(M'_n)$, then, of necessity, $A'_n = A_n^{(\le a)}$ and $M'_n = M_n^{(\le a)}$.

In other words, there exists a unique representation of the form (18). This justifies the name canonical given to the representation (17).


§1c. Local Martingales. Martingale Transformations.
Generalized Martingales


1. In the above analysis of the sequence $H = (H_n)$ based on the Doob decomposition (5) and its generalization, it was the concepts of a ‘martingale’ and ‘predictability’ (and, accordingly, the martingale $M = (M_n)$ and the predictable sequence $A = (A_n)$ involved in the representation of $H$) that played a key role.

This explains why one often calls the subsequent stochastic analysis martingale or stochastic calculus, meaning here analysis in filtered probability spaces, i.e., probability spaces distinguished by a special structure, a flow of σ-algebras $(\mathcal F_n)$ (a ‘filtration’). It is precisely with this structure that stopping times, martingales, predictability, sub- and supermartingales, and some other concepts are connected.

Perhaps an even more important position than that of martingales in modern stochastic calculus is occupied by the concept of a local martingale. Remarkably, local martingales form a wider class than martingales, but retain many important properties of the latter.

We now present several definitions.

Let $(\Omega, \mathcal F, (\mathcal F_n), \mathsf P)$ be a stochastic basis, i.e., a filtered probability space with discrete time $n \ge 0$.


DEFINITION 1. We call a sequence of random variables $X = (X_n)$ defined on the stochastic basis a stochastic sequence if $X_n$ is $\mathcal F_n$-measurable for each $n \ge 0$.

To emphasize this property of measurability one writes stochastic sequences as $X = (X_n, \mathcal F_n)$, thus incorporating in the notation the σ-algebras $\mathcal F_n$ with respect to which the $X_n$ are measurable.


DEFINITION 2. We call a stochastic sequence $X = (X_n, \mathcal F_n)_{n\ge0}$
a martingale,
a supermartingale,
a submartingale
if $\mathsf E|X_n| < \infty$ for each $n \ge 0$ and if ($\mathsf P$-a.s.)
$$\mathsf E(X_n \mid \mathcal F_{n-1}) = X_{n-1},$$
$$\mathsf E(X_n \mid \mathcal F_{n-1}) \le X_{n-1},$$
$$\mathsf E(X_n \mid \mathcal F_{n-1}) \ge X_{n-1},$$
respectively, for all $n \ge 1$.

Clearly, $\mathsf E X_n = \text{Const}$ ($= \mathsf E X_0$) for a martingale, the expectations are nonincreasing ($\mathsf E X_n \le \mathsf E X_{n-1}$) for a supermartingale, and they are nondecreasing ($\mathsf E X_n \ge \mathsf E X_{n-1}$) for a submartingale.

A classical example of a martingale is delivered by a Lévy martingale $X = (X_n)$ with $X_n = \mathsf E(\xi \mid \mathcal F_n)$, where $\xi$ is an $\mathcal F$-measurable random variable such that $\mathsf E|\xi| < \infty$.

This is a uniformly integrable martingale, i.e., the family $\{X_n\}$ is uniformly integrable:
$$\sup_n \mathsf E\bigl(|X_n|\, I(|X_n| > C)\bigr) \to 0 \qquad \text{as } C \to \infty.$$
In what follows, we denote by $\mathcal M_{UI}$ the class of all uniformly integrable martingales. We denote the class of all martingales by $\mathcal M$.

In the case when the martingales in question are defined only for $n \le N < \infty$, the concepts of a martingale and of a uniformly integrable martingale are clearly the same ($\mathcal M = \mathcal M_{UI}$).

Sometimes, when there is a need to point out the measure $\mathsf P$ and the flow $(\mathcal F_n)$ with respect to which the martingale property is considered, one also denotes the classes $\mathcal M_{UI}$ and $\mathcal M$ by $\mathcal M_{UI}(\mathsf P, (\mathcal F_n))$ and $\mathcal M(\mathsf P, (\mathcal F_n))$.


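DEFINITION 3. We call a stochastic sequence $\xi = (\xi_n, \mathcal F_n)_{n\ge1}$ with $\mathsf E|\xi_n| < \infty$ a martingale difference if (here we set $\mathcal F_0 = \{\varnothing, \Omega\}$)
$$\mathsf E(\xi_n \mid \mathcal F_{n-1}) = 0 \quad (\mathsf P\text{-a.s.}), \qquad n \ge 1.$$

Clearly, if $\xi = (\xi_n)$ is such a sequence, then the corresponding sequence of sums $X = (X_n, \mathcal F_n)$, where $X_n = X_0 + \xi_1 + \cdots + \xi_n$, is a martingale. Conversely, with each martingale $X = (X_n, \mathcal F_n)$ we can associate the martingale difference $\xi = (\xi_n, \mathcal F_n)$ with $\xi_n = \Delta X_n$, where $\Delta X_n = X_n - X_{n-1}$ for $n \ge 1$ and $\Delta X_0 = X_0$ for $n = 0$.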
DEFINITION 4. We call a stochastic sequence $X = (X_n, \mathcal F_n)$ a local martingale (submartingale, supermartingale) if there exists a (localizing) sequence $(\tau_k)_{k\ge1}$ of Markov times (i.e., of variables satisfying the condition $\{\omega\colon \tau_k \le n\} \in \mathcal F_n$, $n \ge 1$; see also Definition 1 in § 1f) such that $\tau_k \le \tau_{k+1}$ ($\mathsf P$-a.s.), $\tau_k \uparrow \infty$ ($\mathsf P$-a.s.) as $k \to \infty$, and each ‘stopped’ sequence
$$X^{\tau_k} = (X_{\tau_k \wedge n}, \mathcal F_n)$$
is a martingale (submartingale, supermartingale).
Remark 1. One often includes in the definition of a local martingale the requirement that the sequence $X^{\tau_k}$ be not merely a martingale for each $k \ge 1$, but a uniformly integrable martingale (see, e.g., [250]). We note also that sometimes, intending to consider also sequences $X = (X_n, \mathcal F_n)$ with nonintegrable ‘initial’ random value $X_0$, one defines stopped sequences $X^{\tau_k}$ in a somewhat different way:
$$X^{\tau_k} = (X_{\tau_k \wedge n}\, I(\tau_k > 0), \mathcal F_n).$$

We shall write $\mathcal M_{loc}$ or $\mathcal M_{loc}(\mathsf P, (\mathcal F_n))$ for the class of local martingales. By Definition 4, each martingale is a local martingale, so that
$$\mathcal M \subseteq \mathcal M_{loc}.$$
If $X \in \mathcal M_{loc}$ and the family of random variables
$$\Sigma = \{X_\tau\colon \tau \text{ is a finite stopping time}\}$$
is uniformly integrable (i.e., $\sup_{X_\tau \in \Sigma} \mathsf E\{|X_\tau|\, I(|X_\tau| > C)\} \to 0$ as $C \to \infty$), then $X$ is a martingale ($X \in \mathcal M$); moreover, it is a Lévy martingale: there exists an integrable, $\mathcal F$-measurable random variable $X_\infty$ such that $X_n = \mathsf E(X_\infty \mid \mathcal F_n)$. Thus, $X \in \mathcal M_{UI}$ in this case. (See [250; Chapter I, § 1e] or [439; Chapter VII, § 4] for greater detail.)


DEFINITION 5. We say that a stochastic sequence $X = (X_n, \mathcal F_n)_{n\ge0}$ is a generalized martingale (submartingale, supermartingale) if $\mathsf E|X_0| < \infty$, the generalized conditional expectations $\mathsf E(X_n \mid \mathcal F_{n-1})$ are well defined for each $n \ge 1$ and
$$\mathsf E(X_n \mid \mathcal F_{n-1}) = X_{n-1} \quad (\mathsf P\text{-a.s.})$$
(accordingly, $\mathsf E(X_n \mid \mathcal F_{n-1}) \ge X_{n-1}$ or $\mathsf E(X_n \mid \mathcal F_{n-1}) \le X_{n-1}$).

Remark 2. By the definition (see § 1b) of the generalized expectation $\mathsf E(X_n \mid \mathcal F_{n-1})$ and the ‘martingale’ equality $\mathsf E(X_n \mid \mathcal F_{n-1}) = X_{n-1}$ we obtain automatically that $\mathsf E(|X_n| \mid \mathcal F_{n-1}) < \infty$ ($\mathsf P$-a.s.). This means that the conditional expectation $\mathsf E(X_n \mid \mathcal F_{n-1})$ is not merely well defined, but also finite. Hence we can assume in Definition 5 that $\mathsf E(|X_n| \mid \mathcal F_{n-1}) < \infty$ ($\mathsf P$-a.s.) for $n \ge 1$.


DEFINITION 6. We call a stochastic sequence $\xi = (\xi_n, \mathcal F_n)_{n\ge1}$ a generalized martingale difference (submartingale difference, supermartingale difference) if the generalized conditional expectations $\mathsf E(\xi_n \mid \mathcal F_{n-1})$ are well defined for each $n \ge 1$ and
$$\mathsf E(\xi_n \mid \mathcal F_{n-1}) = 0 \quad (\mathsf P\text{-a.s.})$$
(respectively, if $\mathsf E(\xi_n \mid \mathcal F_{n-1}) \ge 0$ or $\mathsf E(\xi_n \mid \mathcal F_{n-1}) \le 0$ ($\mathsf P$-a.s.)).

Remark 3. As in Remark 2, it can be required in the definition of a generalized martingale difference that $\mathsf E(|\xi_n| \mid \mathcal F_{n-1}) < \infty$ ($\mathsf P$-a.s.) for $n \ge 1$.
DEFINITION 7. Let $M = (M_n, \mathcal F_n)$ be a stochastic sequence and let $Y = (Y_n, \mathcal F_{n-1})$ be a predictable sequence (the $Y_n$ are $\mathcal F_{n-1}$-measurable for $n \ge 1$ and $Y_0$ is $\mathcal F_0$-measurable).

Then we call the stochastic sequence
$$Y \cdot M = ((Y \cdot M)_n, \mathcal F_n),$$
where
$$(Y \cdot M)_n = Y_0 \cdot M_0 + \sum_{1 \le k \le n} Y_k\,\Delta M_k,$$
the transformation of $M$ by means of $Y$. If, in addition, $M$ is a martingale, then we call $X = Y \cdot M$ a martingale transformation (of the martingale $M$ by means of the (predictable) sequence $Y$).
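A martingale transformation is the discrete-time form of a ‘betting strategy’ applied to a fair game. The sketch below evaluates $(Y \cdot M)_n$ for a symmetric $\pm1$ walk and a stake that depends only on past increments, and checks by full enumeration that the mean stays zero. The strategy `double_after_loss` and all names are hypothetical illustrations.

```python
from itertools import product

def transform(y0, m0, strategy, increments):
    """(Y.M)_n = Y_0*M_0 + sum_k Y_k * dM_k, with Y_k = strategy(dM_1..dM_{k-1})."""
    total = y0 * m0
    for k, dm in enumerate(increments):
        y_k = strategy(increments[:k])   # depends on the past only: predictable
        total += y_k * dm
    return total

def double_after_loss(past):
    """Hypothetical stake: 1 at the start, doubled after each losing step."""
    stake = 1
    for dm in past:
        stake = stake * 2 if dm < 0 else 1
    return stake

n = 8
paths = list(product((-1, 1), repeat=n))
mean = sum(transform(0, 0, double_after_loss, p) for p in paths) / len(paths)
print(mean)  # 0.0
```

Since the stake is bounded over a fixed horizon, the transformed sequence is again a martingale, which is why the enumerated mean vanishes exactly.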


The following result shows that the concepts introduced by Definitions 4, 5, and 7 are closely related in the discrete-time case.

THEOREM. Let $X = (X_n, \mathcal F_n)_{n\ge0}$ be a stochastic sequence with $\mathsf E|X_0| < \infty$. Then the following conditions are equivalent:
(a) $X$ is a local martingale ($X \in \mathcal M_{loc}$);
(b) $X$ is a generalized martingale ($X \in G\mathcal M$);
(c) $X$ is a martingale transformation ($X \in \mathcal M T$), i.e., $X = Y \cdot M$ for some predictable sequence $Y = (Y_n, \mathcal F_{n-1})$ and some martingale $M = (M_n, \mathcal F_n)$.


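Proof. (c) ⟹ (a). Let $X \in \mathcal M T$ and let
$$X_n = X_0 + \sum_{k=1}^{n} Y_k\,\Delta M_k, \tag{1}$$
where $Y$ is a predictable sequence and $M$ is a martingale. If $|Y_k| \le C$ for $k \ge 1$, then $X$ is clearly a martingale. Otherwise we set $\tau_j = \inf\{n - 1\colon |Y_n| > j\}$. Since the $Y_n$ are $\mathcal F_{n-1}$-measurable, the $\tau_j$ are stopping times, $\tau_j \uparrow \infty$ as $j \to \infty$, and the ‘stopped sequences’ $X^{\tau_j}$ are again of the form (1) with bounded $Y_k^{(j)} = Y_k I\{k \le \tau_j\}$. Hence $X \in \mathcal M_{loc}$.

(a) ⟹ (b). Let $X \in \mathcal M_{loc}$ and let $(\tau_k)$ be the corresponding localizing sequence. Then $\mathsf E|X_{n+1}^{\tau_k}| < \infty$ and $\mathsf E(|X_{n+1}| \mid \mathcal F_n) = \mathsf E(|X_{n+1}^{\tau_k}| \mid \mathcal F_n)$ on the set $\{\tau_k > n\} \in \mathcal F_n$. Hence $\mathsf E(|X_{n+1}| \mid \mathcal F_n) < \infty$ ($\mathsf P$-a.s.).

In a similar way, on the same set $\{\tau_k > n\}$ we have
$$\mathsf E(X_{n+1} \mid \mathcal F_n) = \mathsf E(X_{n+1}^{\tau_k} \mid \mathcal F_n) = X_n^{\tau_k} = X_n,$$
so that $X \in G\mathcal M$.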
(b) ⟹ (c). Let $X \in G\mathcal M$. We set
$$A_n(k) = \{\omega\colon \mathsf E(|X_{n+1}| \mid \mathcal F_n) \in [k, k+1)\}.$$
Then
$$u_n = \sum_{k \ge 0} (k+1)^{-3}\,\Delta X_n\, I_{A_{n-1}(k)}$$
is an $\mathcal F_n$-measurable random variable with $\mathsf E(u_n \mid \mathcal F_{n-1}) = 0$. Hence $M_n = \sum_{i=1}^{n} u_i$ is a martingale (we set $M_0 = 0$) and (1) holds for $Y = (Y_n)$, where
$$Y_n = \sum_{k \ge 0} (k+1)^{3}\, I_{A_{n-1}(k)},$$
so that $X \in \mathcal M T$.


2. We shall demonstrate the full extent of the importance of the concepts of a local martingale, a martingale transformation, and a generalized martingale in financial mathematics in Chapter V. These concepts also play an important role in stochastic calculus, which can be shown, for example, as follows.

Let $X = (X_n, \mathcal F_n)_{n\ge0}$ be a local submartingale with localizing sequence $(\tau_k)$, where $\tau_k > 0$ ($\mathsf P$-a.s.). Then for each $k$ we obtain the following decomposition for $X^{(\tau_k)} = (X_{n\wedge\tau_k}, \mathcal F_n)$:
$$X_{n\wedge\tau_k} = A_n^{(\tau_k)} + M_n^{(\tau_k)}$$
with predictable sequences $(A_n^{(\tau_k)})_{n\ge0}$ and martingales $(M_n^{(\tau_k)})_{n\ge0}$.

This decomposition is unique (provided that the sequences $(A_n^{(\tau_k)})_{n\ge0}$ are predictable), therefore it is easy to see that
$$A_{n\wedge\tau_k}^{(\tau_{k+1})} = A_n^{(\tau_k)}.$$
Setting $A_n = A_n^{(\tau_k)}$ for $n \le \tau_k$ we see that $(X_n - A_n)_{n\ge0}$ is a local martingale because the ‘stopped’ sequences
$$(X_{n\wedge\tau_k} - A_{n\wedge\tau_k})_{n\ge0}$$
are martingales.

Hence if $X = (X_n, \mathcal F_n)_{n\ge0}$ is a local submartingale, then
$$X_n = A_n + M_n, \qquad n \ge 0, \tag{2}$$
where $A = (A_n)$ is a predictable sequence with $A_0 = 0$, and $M = (M_n)$ is a local martingale.

It should be noted that the sequence $A = (A_n)$ is increasing (more precisely, nondecreasing) in this case, which is a consequence of the following explicit formula for the $A_n^{(\tau_k)}$:
$$A_n^{(\tau_k)} = \sum_{i \le n\wedge\tau_k} \mathsf E(\Delta X_i \mid \mathcal F_{i-1}),$$
and of the submartingale property $\mathsf E(\Delta X_i \mid \mathcal F_{i-1}) \ge 0$.

We note also that the decomposition (2) with predictable process $A = (A_n)$ is unique.
DEFINITION 8. Let $X = (X_n, \mathcal F_n)$ be a stochastic sequence admitting a representation $X_n = A_n + M_n$ with predictable sequence $A = (A_n, \mathcal F_{n-1})$ and local martingale $M = (M_n, \mathcal F_n)$. Then we say that $X$ admits a generalized Doob decomposition and $A$ is the compensator (or the predictable compensator, or else the dual predictable projection) of the sequence $X$.

(We call $A$ a ‘compensator’ since it compensates $X$ to a local martingale.)

3. In conclusion, we present a simple but useful result of [251] describing conditions ensuring that a local martingale is a (usual) martingale.

LEMMA. 1) Let $X = (X_n, \mathcal F_n)_{n\ge0}$ be a local martingale such that $\mathsf E|X_0| < \infty$ and either
$$\mathsf E X_n^- < \infty, \qquad n \ge 0, \tag{3}$$
or
$$\mathsf E X_n^+ < \infty, \qquad n \ge 0. \tag{4}$$
Then $X = (X_n, \mathcal F_n)_{n\ge0}$ is a martingale.
2) Let $X = (X_n, \mathcal F_n)_{0\le n\le N}$ be a local martingale and assume that $N < \infty$, $\mathsf E|X_0| < \infty$, and either $\mathsf E X_N^- < \infty$ or $\mathsf E X_N^+ < \infty$. Then (3) and (4) hold for each $n \le N$ and $X = (X_n, \mathcal F_n)_{0\le n\le N}$ is a martingale.
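Proof. 1) We claim that each of conditions (3) and (4) implies the other and, therefore, the inequality $\mathsf E|X_n| < \infty$, $n \ge 0$.

For if, say, (3) holds, then by Fatou's lemma ([439; Chapter II, § 6]),
$$\mathsf E X_n^+ = \mathsf E \liminf_k X_{n\wedge\tau_k}^+ \le \liminf_k \mathsf E X_{n\wedge\tau_k}^+ = \liminf_k \bigl[\mathsf E X_{n\wedge\tau_k} + \mathsf E X_{n\wedge\tau_k}^-\bigr] = \mathsf E X_0 + \liminf_k \mathsf E X_{n\wedge\tau_k}^- \le |\mathsf E X_0| + \sum_{i=0}^{n} \mathsf E X_i^- < \infty.$$
Consequently, $\mathsf E|X_n| < \infty$, $n \ge 0$.

Further, we have $|X_{(n+1)\wedge\tau_k}| \le \sum_{i=0}^{n+1} |X_i|$ with $\mathsf E \sum_{i=0}^{n+1} |X_i| < \infty$, therefore by Lebesgue's theorem on dominated convergence ([439; Chapter II, §§ 6 and 7]), passing to the limit (as $k \to \infty$) in the relation
$$\mathsf E\bigl(X_{\tau_k\wedge(n+1)}\, I(\tau_k > 0) \mid \mathcal F_n\bigr) = X_{\tau_k\wedge n}\, I(\tau_k > 0)$$
we obtain $\mathsf E(X_{n+1} \mid \mathcal F_n) = X_n$, $n \ge 0$.

2) We note that if $\mathsf E X_N^- < \infty$, then also $\mathsf E X_n^- < \infty$ for $n \le N$, because local martingales are generalized martingales. Hence $X_n = \mathsf E(X_{n+1} \mid \mathcal F_n)$, and therefore $X_n^- \le \mathsf E(X_{n+1}^- \mid \mathcal F_n)$ and $\mathsf E X_n^- \le \mathsf E X_{n+1}^- \le \mathsf E X_N^-$ for all $n \le N - 1$.

Thus, it follows from 1) that the sequence $X = (X_n, \mathcal F_n)_{0\le n\le N}$ is a martingale.

In a similar way we can consider the case of $\mathsf E X_N^+ < \infty$, thus completing the proof of the lemma.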
COROLLARY. Each local martingale $X = (X_n)_{n\ge0}$ that is bounded below ($\inf_n X_n(\omega) \ge C > -\infty$, $\mathsf P$-a.s.) or above ($\sup_n X_n(\omega) \le C < \infty$, $\mathsf P$-a.s.) is a martingale.

4. It is instructive to compare the result of the lemma with the corresponding result in the continuous-time case.

If $(\Omega, \mathcal F, (\mathcal F_t)_{t\ge0}, \mathsf P)$ is a filtered probability space with a nondecreasing family of σ-algebras $(\mathcal F_t)_{t\ge0}$ ($\mathcal F_s \subseteq \mathcal F_t \subseteq \mathcal F$, $s \le t$), then we call a stochastic process $X = (X_t)_{t\ge0}$ a martingale (a supermartingale, a submartingale) if the $X_t$ are $\mathcal F_t$-measurable, $\mathsf E|X_t| < \infty$ for $t \ge 0$, and $\mathsf E(X_t \mid \mathcal F_s) = X_s$ ($\mathsf E(X_t \mid \mathcal F_s) \le X_s$ or $\mathsf E(X_t \mid \mathcal F_s) \ge X_s$) for $s \le t$.

We call the process $X = (X_t)_{t\ge0}$ a local martingale if we can find a nondecreasing sequence of stopping times $(\tau_k)$ such that $\tau_k \uparrow \infty$ ($\mathsf P$-a.s.) and the stopped processes $X^{\tau_k} = (X_{t\wedge\tau_k}\, I(\tau_k > 0), \mathcal F_t)$ are uniformly integrable martingales for all $k$.

Using the same arguments as in the proof of the lemma, which are based on Fatou's lemma and Lebesgue's theorem on dominated convergence ([439; Chapter II, §§ 6 and 7]), we can prove the following results.

I. Each local martingale $X = (X_t)_{t\ge0}$ satisfying the condition
$$\mathsf E \sup_{s \le t} X_s^- < \infty, \qquad t \ge 0,$$
is a supermartingale.

II. Each local martingale $X = (X_t)_{t\ge0}$ satisfying the condition
$$\mathsf E \sup_{s \le t} |X_s| < \infty, \qquad t \ge 0,$$
is a martingale.

III. Each local martingale $X = (X_t)_{t\ge0}$ satisfying the condition
$$\mathsf E \sup_{t \ge 0} |X_t| < \infty$$
is a uniformly integrable martingale.

It is useful to bear in mind that, in the discrete-time case, if $\mathsf E X_n^- < \infty$ for $n \le N$, then $\mathsf E \max_{n \le N} X_n^- < \infty$. However, for continuous time, the inequality $\mathsf E X_t^- < \infty$, $t \le T$, does not in general mean that $\mathsf E \sup_{t \le T} X_t^- < \infty$. Essentially, this is the main reason explaining why the result of the above lemma cannot be automatically extended to the case of continuous time.

We note also that the local martingale in assertion I can indeed be a ‘proper’ supermartingale and not a martingale.

The following example is well known.
EXAMPLE. Let $B = (B_t^1, B_t^2, B_t^3)_{t\ge0}$ be a three-dimensional Brownian motion. Let
$$R_t = \sqrt{(B_t^1)^2 + (B_t^2)^2 + (B_t^3)^2} \qquad \text{with } R_0 = \sqrt{(B_0^1)^2 + (B_0^2)^2 + (B_0^3)^2} = 1.$$
Then the process $R_t$ (called the Bessel process of order 3) has the stochastic differential
$$dR_t = \frac{dt}{R_t} + d\beta_t, \qquad R_0 = 1,$$
where $\beta = (\beta_t)_{t\ge0}$ is a standard Brownian motion (see § 3a, Chapter II, and, e.g., [402]).

By the Itô formula
$$df(R_t) = f'(R_t)\,dR_t + \tfrac12 f''(R_t)\,dt$$
(see Chapter III, § 3d and also [250], [402]) used for $f(R_t) = 1/R_t$ (which is possible because the zero value is unattainable for the three-dimensional Bessel process $R$ with $R_0 = 1$) we obtain
$$d\Bigl(\frac{1}{R_t}\Bigr) = -\frac{d\beta_t}{R_t^2},$$
or, in the integral form,
$$\frac{1}{R_t} = 1 - \int_0^t \frac{d\beta_s}{R_s^2}.$$
The stochastic integral $\bigl(\int_0^t \frac{d\beta_s}{R_s^2}\bigr)_{t\ge0}$ is a local martingale (see [402]), therefore the process $X = (X_t)_{t\ge0}$ with $X_t = \frac{1}{R_t}$ and $X_0 = 1$ is a local martingale, but not a martingale.

Indeed, by the self-similarity property (see Chapter II, § 3a) of the Brownian motions $B^1$, $B^2$, $B^3$ we obtain
$$\mathsf E X_t = \mathsf E\,\frac{1}{\sqrt{1 + (B_t^1 - B_0^1)^2 + (B_t^2 - B_0^2)^2 + (B_t^3 - B_0^3)^2}} = \mathsf E\,\frac{1}{\sqrt{1 + t(\xi_1^2 + \xi_2^2 + \xi_3^2)}} \downarrow 0 \qquad \text{as } t \to \infty,$$
where the $\xi_i$, $i = 1, 2, 3$, are independent standard normally distributed random variables ($\xi_i \sim \mathcal N(0,1)$). On the other hand, if we had the martingale property, then the expectations $\mathsf E X_t$ would be constant.
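The decay of $\mathsf E X_t$ can also be observed by simulation. The sketch below samples the three-dimensional Brownian motion at a fixed time $t$ directly from its Gaussian distribution, with the assumed starting point $B_0 = (1, 0, 0)$ (any point at distance one would do), and averages $1/R_t$. The sample size, the seed, and the function name are hypothetical choices.

```python
import math
import random

def mean_inverse_radius(t, n=20000, seed=1):
    """Monte Carlo estimate of E[1/R_t] for a 3-d Brownian motion with |B_0| = 1."""
    rng = random.Random(seed)
    s = math.sqrt(t)
    total = 0.0
    for _ in range(n):
        x = 1.0 + s * rng.gauss(0.0, 1.0)   # first coordinate starts at 1
        y = s * rng.gauss(0.0, 1.0)
        z = s * rng.gauss(0.0, 1.0)
        total += 1.0 / math.sqrt(x * x + y * y + z * z)
    return total / n

for t in (0.5, 1.0, 4.0, 16.0):
    print(t, mean_inverse_radius(t))
```

The estimates stay below $X_0 = 1$ and decrease as $t$ grows, in agreement with the claim that $X$ is a ‘proper’ supermartingale rather than a martingale.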
§1d. Gaussian and Conditionally Gaussian Models


1. The concept of an ‘efficient’ market substantiates the ‘martingale conjecture’ for (discounted) prices. This makes ‘martingale’ a central notion in the analysis of the dynamics of prices regarded as stochastic sequences or processes having distributions with specific properties. However, the mere fact that the distributions have the martingale property is not sufficient for concrete calculations; one must know a ‘finer’ structure of these distributions, which brings one to the necessity of a thorough study of the most diverse probabilistic and statistical models in order to find ones with distributions better conforming to the properties of the empirical distributions constructed on the basis of statistical data. The rest of this chapter is in fact devoted to achieving this aim. We present models enabling one to explain various properties discovered by the analysis of the statistical ‘stock’ provided, in particular, by financial time series.

The assumption that the distributions $\operatorname{Law}(h_1, \dots, h_n)$ of the variables $h_1, \dots, h_n$ are Gaussian is, of course, most attractive from the viewpoints of both theoretical analysis and the well-developed ‘statistics of the normal distribution’. However, one must take into account the fact (already pointed out in this book) that, as the statistical analysis of many financial series shows, this conjecture does not always reflect the behavior of prices properly.

Seeking an alternative to the conjecture that the unconditional distributions 
Law(h1,..., An) for the sequence h = (hn)n>1 are Gaussian and bearing in mind the 
Doob decomposition, which was introduced in terms of the conditional expectations 
E(hn | ¥%n—1), it seems to be fairly natural to assume that these are conditional 
(rather than the unconditional) probability distributions Law (hn | ¥n—1) that are 
Gaussian. In other words, 


Law(hn | Fn—1) = N (un, o2) (1) 


for some ¥p—1-measurable variables pn = un(w) and o2 = o2(w).* 
More precisely, (1) means that the (regular) conditional distribution 
P(hn < «| Fn—1) can be described by the formula 


_ (yen (w))? 


P(hn < £| Fn—1)(w) 2on(w) dy 


1 T 
= ——- e 
V 2r02 is 
for alla € R and w E Q. 
In view of the regularity, we can (see [439; Chapter II, § 7]) evaluate the expectations
$\mathsf{E}(h_n \mid \mathscr{F}_{n-1})(\omega)$ by mere integration (for each fixed $\omega \in \Omega$):

$$\mathsf{E}(h_n \mid \mathscr{F}_{n-1})(\omega) = \int_{-\infty}^{\infty} x \, d\mathsf{P}(h_n \le x \mid \mathscr{F}_{n-1})(\omega),$$

which, in our case, brings us to the formula

$$\mathsf{E}(h_n \mid \mathscr{F}_{n-1}) = \mu_n. \tag{2}$$

* To avoid the consideration of trivial cases in which some special explanations are
nevertheless necessary, we shall assume throughout that $\sigma_n(\omega) \ne 0$ for all $n$ and $\omega$.


In a similar way,

$$\mathsf{D}(h_n \mid \mathscr{F}_{n-1}) = \sigma_n^2. \tag{3}$$

Thus, the ‘parameters’ $\mu_n$ and $\sigma_n^2$ have a simple, ‘traditional’ meaning: they
are the conditional expectation and the conditional variance of the (conditional)
distribution $\mathrm{Law}(h_n \mid \mathscr{F}_{n-1})$.

The distribution $\mathrm{Law}(h_n)$ itself is, therefore, a ‘suspension’ (a ‘mixture’) of the
conditionally Gaussian distributions $\mathrm{Law}(h_n \mid \mathscr{F}_{n-1})$ averaged over the distribution
of $\mu_n$ and $\sigma_n^2$.

We note that ‘mixtures’ of normal distributions $\mathscr{N}(\mu, \sigma^2)$ with ‘random’ parameters
$\mu = \mu(\omega)$ and $\sigma^2 = \sigma^2(\omega)$ form a rather broad class of distributions. We shall
repeatedly come across particular cases of such distributions in what follows.

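Besides the sequence $h = (h_n)$, it is useful to consider the ‘standard’ conditionally
Gaussian sequence $\varepsilon = (\varepsilon_n)_{n \ge 1}$ of $\mathscr{F}_n$-measurable random variables such
that

$$\mathrm{Law}(\varepsilon_n \mid \mathscr{F}_{n-1}) = \mathscr{N}(0, 1), \quad \text{where } \mathscr{F}_0 = \{\varnothing, \Omega\}.$$

This sequence is obviously a martingale difference since $\mathsf{E}(\varepsilon_n \mid \mathscr{F}_{n-1}) = 0$. More
than that, this is a sequence of independent variables with standard normal
distribution $\mathscr{N}(0, 1)$ because

$$\mathrm{Law}(\varepsilon_n \mid \varepsilon_1, \ldots, \varepsilon_{n-1}) = \mathscr{N}(0, 1).$$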


By our assumption above, $\sigma_n(\omega) \ne 0$ ($n \ge 1$, $\omega \in \Omega$), so that the variables
$\varepsilon_n = (h_n - \mu_n)/\sigma_n$, $n \ge 1$, form a standard Gaussian sequence. Hence we can
assume that the conditionally Gaussian (with respect to the flow $(\mathscr{F}_n)$ and the
probability $\mathsf{P}$) sequences $h = (h_n)_{n \ge 1}$ that we consider can be represented as
sums

$$h_n = \mu_n + \sigma_n \varepsilon_n, \tag{4}$$

where $\varepsilon = (\varepsilon_n)$ is a sequence of independent $\mathscr{F}_n$-measurable random variables with
standard normal distribution $\mathscr{N}(0, 1)$. (As regards the representation of $h = (h_n)$
in the general case, when $\sigma_n$ may vanish, see [303; Chapter 13, § 1].)

It is clear that to study the probabilistic properties of the sequence $h = (h_n)$
(and therefore of $S = (S_n)$) we must specify further the structure of the variables
$\mu_n$ and $\sigma_n^2$. This is what we do in the models that follow.

We note that in the study of the distributions of $h = (h_n)$, given that we prefer
to deal with conditionally Gaussian ones, it is often reasonable to consider this
property in the following framework.
Let $(\mathscr{G}_n)$ be a subfiltration of $(\mathscr{F}_n)$, i.e., assume that $\mathscr{G}_n \subseteq \mathscr{F}_n$ and $\mathscr{G}_n \subseteq \mathscr{G}_{n+1}$;
for instance, let $\mathscr{G}_n = \mathscr{F}_{n-1}$. We assume now that $\mathrm{Law}(h_n \mid \mathscr{G}_n) = \mathscr{N}(\mu_n, \sigma_n^2)$ with
$\mu_n = \mathsf{E}(h_n \mid \mathscr{G}_n)$ and $\sigma_n^2 = \mathsf{D}(h_n \mid \mathscr{G}_n)$. Then, again, the distribution $\mathrm{Law}(h_n)$ is a
mixture of Gaussian distributions.

We proceed now to several particular (linear and nonlinear) Gaussian and
conditionally Gaussian models in which the variables $\mu_n$ and $\sigma_n$, $n \ge 1$, are described more
concretely in their dependence on the ‘initial values’ $(\ldots, h_{-1}, h_0)$ and $(\ldots, \varepsilon_{-1}, \varepsilon_0)$
of $h$ and $\varepsilon$ that must be set separately.


2. AutoRegressive model of order p (AR(p)). In this model, it is assumed that

$$\mathscr{F}_n = \sigma(\varepsilon_1, \ldots, \varepsilon_n) \tag{5}$$

and

$$\mu_n = a_0 + a_1 h_{n-1} + \cdots + a_p h_{n-p}, \tag{6}$$

$$\sigma_n \equiv \sigma = \mathrm{Const} \quad (\sigma > 0). \tag{7}$$

Thus, here we have

$$h_n = \mu_n + \sigma_n \varepsilon_n = a_0 + a_1 h_{n-1} + \cdots + a_p h_{n-p} + \sigma \varepsilon_n.$$

To define the sequence $h = (h_n)_{n \ge 1}$, which is called the autoregressive model of
order $p$, we must fix ‘initial values’, the variables $h_{1-p}, \ldots, h_0$. If they are constants,
then the sequence $(h_n)_{n \ge 1}$ is not merely conditionally Gaussian but truly Gaussian.
In § 2b, we consider the properties of the autoregressive model of order one ($p = 1$)
in greater detail.

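3. Moving Average model (MA(q)). In this model, we fix the ‘initial values’
$(\varepsilon_{1-q}, \ldots, \varepsilon_{-1}, \varepsilon_0)$ and set $\mathscr{F}_n = \sigma(\varepsilon_1, \ldots, \varepsilon_n)$,

$$\mu_n = b_0 + b_1 \varepsilon_{n-1} + b_2 \varepsilon_{n-2} + \cdots + b_q \varepsilon_{n-q}, \qquad \sigma_n \equiv \sigma = \mathrm{Const}, \tag{8}$$

so that

$$h_n = b_0 + b_1 \varepsilon_{n-1} + b_2 \varepsilon_{n-2} + \cdots + b_q \varepsilon_{n-q} + \sigma \varepsilon_n. \tag{9}$$
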
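4. AutoRegressive Moving Average model of order (p,q) (ARMA(p,q)).
We set $\mathscr{F}_n = \sigma(\varepsilon_1, \ldots, \varepsilon_n)$, define the initial conditions $(\varepsilon_{1-q}, \ldots, \varepsilon_{-1}, \varepsilon_0)$ and
$(h_{1-p}, \ldots, h_{-1}, h_0)$, and also set

$$\mu_n = (a_0 + a_1 h_{n-1} + \cdots + a_p h_{n-p}) + (b_1 \varepsilon_{n-1} + b_2 \varepsilon_{n-2} + \cdots + b_q \varepsilon_{n-q}) \tag{10}$$

and

$$\sigma_n \equiv \sigma = \mathrm{Const}.$$

Hence, in the ARMA(p,q) model we obtain

$$h_n = (a_0 + a_1 h_{n-1} + \cdots + a_p h_{n-p}) + (b_1 \varepsilon_{n-1} + b_2 \varepsilon_{n-2} + \cdots + b_q \varepsilon_{n-q}) + \sigma \varepsilon_n. \tag{11}$$

All these models, AR(p), MA(q), and ARMA(p,q), are linear Gaussian models
(provided that the ‘initial values’ are, say, constants).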
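
As an illustration of the three linear schemes above, the ARMA(p,q) recursion can be simulated directly; AR(p) and MA(q) are then the special cases with no moving-average or no autoregressive coefficients. The following is only a minimal sketch in Python: the function name `simulate_arma`, its arguments, and the choice of zero ‘initial values’ are assumptions made for the example, not part of the text.

```python
import random


def simulate_arma(a, b, sigma, n, h_init=None, eps_init=None, seed=0):
    """Simulate h_k = a[0] + sum_i a[i]*h_{k-i} + sum_j b[j-1]*eps_{k-j} + sigma*eps_k.

    a = [a_0, a_1, ..., a_p], b = [b_1, ..., b_q] (p = 0 or q = 0 is allowed).
    h_init = [h_{1-p}, ..., h_0] and eps_init = [eps_{1-q}, ..., eps_0] are the
    constant 'initial values'; both default to zeros (an assumption of this sketch).
    Returns the lists (h_1, ..., h_n) and (eps_1, ..., eps_n).
    """
    rng = random.Random(seed)
    p, q = len(a) - 1, len(b)
    h_hist = list(h_init) if h_init is not None else [0.0] * p
    e_hist = list(eps_init) if eps_init is not None else [0.0] * q
    if len(h_hist) != p or len(e_hist) != q:
        raise ValueError("need p initial values of h and q initial values of eps")
    h, eps = [], []
    for _ in range(n):
        e = rng.gauss(0.0, 1.0)  # eps_k ~ N(0, 1)
        mu = a[0]
        mu += sum(a[i] * h_hist[-i] for i in range(1, p + 1))      # autoregressive part
        mu += sum(b[j - 1] * e_hist[-j] for j in range(1, q + 1))  # moving-average part
        h_k = mu + sigma * e
        h.append(h_k)
        eps.append(e)
        h_hist.append(h_k)
        e_hist.append(e)
    return h, eps


# AR(1), MA(1) and ARMA(1,1) as special cases of the same recursion
h_ar, _ = simulate_arma([0.0, 0.5], [], 0.1, 200)
h_ma, _ = simulate_arma([1.0], [1.0], 0.1, 200)
h_arma, eps_arma = simulate_arma([0.5, 0.3], [0.4], 0.1, 200, seed=1)
```

In this parametrization the constant term of the MA(q) model (written $b_0$ in (8)) is passed as `a[0]`.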

We proceed now to certain interesting conditionally Gaussian models that (in
contrast to the above ones) are nonlinear.

5. AutoRegressive Conditional Heteroskedastic model (ARCH(p)). Once
again, we assume that the sequence $\varepsilon = (\varepsilon_n)_{n \ge 1}$ is the (only) source of randomness
and set $\mathscr{F}_n = \sigma(\varepsilon_1, \ldots, \varepsilon_n)$. Now let

$$\mu_n = \mathsf{E}(h_n \mid \mathscr{F}_{n-1}) = 0 \tag{12}$$

and

$$\sigma_n^2 = \mathsf{E}(h_n^2 \mid \mathscr{F}_{n-1}) = \alpha_0 + \sum_{i=1}^{p} \alpha_i h_{n-i}^2, \tag{13}$$

where $\alpha_0 > 0$, $\alpha_i \ge 0$, $i = 1, \ldots, p$, $\mathscr{F}_0 = \{\varnothing, \Omega\}$, and $h_{1-p}, \ldots, h_0$ are certain
initial constants.
In other words, the conditional variance $\sigma_n^2$ is a function of $h_{n-1}^2, \ldots, h_{n-p}^2$.


This model, which was introduced (as already mentioned in Chapter I, § 2e) by
R. F. Engle [140] (1982), proved to be rather successful in the explanation of several
nontrivial properties of financial time series, such as the phenomenon of the cluster
property for the values of the variables $h_n$.
Thus,

$$h_n = \sigma_n \varepsilon_n, \quad n \ge 1, \tag{14}$$

where $\varepsilon = (\varepsilon_n)$ is a sequence of independent, normally distributed random variables,
$\varepsilon_n \sim \mathscr{N}(0, 1)$, and the $\sigma_n^2$ are defined by (13).
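
To see how (13) and (14) generate a path, one can iterate them numerically. The sketch below is a hedged illustration in Python, not a reference implementation: the name `simulate_arch`, the zero initial constants, and the sample coefficients are assumptions chosen for the example.

```python
import math
import random


def simulate_arch(alpha, n, h_init=None, seed=0):
    """Simulate h_k = sigma_k * eps_k with sigma_k^2 = alpha[0] + sum_i alpha[i]*h_{k-i}^2.

    alpha = [alpha_0, alpha_1, ..., alpha_p] with alpha_0 > 0 and alpha_i >= 0.
    h_init = [h_{1-p}, ..., h_0] are the initial constants (zeros by default,
    an assumption of this sketch).
    Returns the lists h, sigma2, eps for k = 1, ..., n.
    """
    rng = random.Random(seed)
    p = len(alpha) - 1
    hist = list(h_init) if h_init is not None else [0.0] * p
    if len(hist) != p:
        raise ValueError("need p initial values")
    h, sigma2, eps = [], [], []
    for _ in range(n):
        # conditional variance (13): depends only on the p previous squared values
        s2 = alpha[0] + sum(alpha[i] * hist[-i] ** 2 for i in range(1, p + 1))
        e = rng.gauss(0.0, 1.0)
        h_k = math.sqrt(s2) * e  # formula (14)
        h.append(h_k)
        sigma2.append(s2)
        eps.append(e)
        hist.append(h_k)
    return h, sigma2, eps


h, sigma2, eps = simulate_arch([0.2, 0.5], 1000, seed=3)
```

A large value of $|h_{n-1}|$ raises $\sigma_n^2$ and so makes a large $|h_n|$ more likely, which is one way to read the ‘cluster’ property mentioned above.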
If

$$\mu_n = a_0 + a_1 h_{n-1} + \cdots + a_r h_{n-r} \tag{15}$$

in place of (12) and the $\sigma_n^2$ satisfy (13), then (4) takes the following form:

$$h_n = a_0 + a_1 h_{n-1} + \cdots + a_r h_{n-r} + \sigma_n \varepsilon_n. \tag{16}$$

Such models are sometimes denoted by AR(r)/ARCH(p).
We now set (assuming that $\mathsf{E} h_n^2 < \infty$)

$$\nu_n = h_n^2 - \sigma_n^2. \tag{17}$$

Then

$$h_n^2 = \alpha_0 + \sum_{i=1}^{p} \alpha_i h_{n-i}^2 + \nu_n, \tag{18}$$

where

$$\mathsf{E}(\nu_n \mid \mathscr{F}_{n-1}) = \mathsf{E}(h_n^2 \mid \mathscr{F}_{n-1}) - \sigma_n^2 = 0$$

by (13), i.e., the sequence $\nu = (\nu_n)$ is a martingale difference.
Thus, we can regard the ARCH(p) model as the autoregressive model AR(p)
for the sequence $(h_n^2)$ with ‘noise’ $\nu = (\nu_n)$ that is a martingale difference.

6. Generalized AutoRegressive Conditional Heteroskedastic model
(GARCH(p,q)). The successes with the application of ARCH(p) encouraged
the development of various generalizations, refinements, and modifications of this
model.

The GARCH(p,q) model described below and introduced by T. Bollerslev [48]
(1986) is one such modification.

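As before, let

$$\mu_n = \mathsf{E}(h_n \mid \mathscr{F}_{n-1}) = 0,$$

but now, in place of (13), we assume that

$$\sigma_n^2 = \mathsf{E}(h_n^2 \mid \mathscr{F}_{n-1}) = \alpha_0 + \sum_{i=1}^{p} \alpha_i h_{n-i}^2 + \sum_{j=1}^{q} \beta_j \sigma_{n-j}^2, \tag{19}$$

where $\alpha_0 > 0$, $\alpha_i, \beta_j \ge 0$, and where $(h_{1-p}, \ldots, h_0)$, $(\sigma_{1-q}^2, \ldots, \sigma_0^2)$ are the ‘initial
values’, which we can for simplicity set to be constants.
In the GARCH(p,q) model we set

$$h_n = \sigma_n \varepsilon_n, \tag{20}$$

where $(\varepsilon_1, \varepsilon_2, \ldots)$ is a sequence of independent identically distributed random variables
with distribution $\mathscr{N}(0, 1)$ and the $\sigma_n^2$ satisfy (19).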
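Now let

$$\alpha(L) h_{n-1}^2 = \sum_{i=1}^{p} \alpha_i h_{n-i}^2, \tag{21}$$

where $L$ is the lag operator ($L^i h_{n-1}^2 = h_{n-1-i}^2$; see § 2a.2 below), and

$$\beta(L) \sigma_{n-1}^2 = \sum_{j=1}^{q} \beta_j \sigma_{n-j}^2. \tag{22}$$

In this notation

$$\sigma_n^2 = \alpha_0 + \alpha(L) h_{n-1}^2 + \beta(L) \sigma_{n-1}^2.$$

Setting $\nu_n = h_n^2 - \sigma_n^2$ as above, we now obtain

$$h_n^2 = \nu_n + \sigma_n^2 = \nu_n + \alpha_0 + \alpha(L) h_{n-1}^2 + \beta(L)(h_{n-1}^2 - \nu_{n-1})
= \alpha_0 + (\alpha(L) + \beta(L)) h_{n-1}^2 - \beta(L) \nu_{n-1} + \nu_n.$$
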
In other terms,

$$h_n^2 = \alpha_0 + (\alpha(L) + \beta(L)) h_{n-1}^2 + \nu_n - \beta(L) \nu_{n-1}. \tag{23}$$

Hence we can regard the GARCH(p,q) model as the autoregressive moving
average model ARMA(max(p,q), q) for the sequence $(h_n^2)$ with ‘noise’ $(\nu_n)$ that
is a martingale difference.

In particular, setting $\nu_n = h_n^2 - \sigma_n^2$ in ARCH(1) with

$$h_n = \sigma_n \varepsilon_n \quad \text{and} \quad \sigma_n^2 = \alpha_0 + \alpha_1 h_{n-1}^2$$

we see that

$$h_n^2 = \alpha_0 + \alpha_1 h_{n-1}^2 + \nu_n,$$

where the ‘noise’ $(\nu_n)$ is a martingale difference.
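
Identity (23) is purely algebraic, so it can be checked path by path on simulated data. The following Python sketch does this under stated assumptions: the helper names `simulate_garch` and `arma_residual` are hypothetical, and for convenience both initial vectors are taken of length max(p, q) so that every term of (23) is defined from the first simulated step.

```python
import random


def simulate_garch(alpha0, alpha, beta, n, h2_init, s2_init, seed=0):
    """Simulate squared values h_k^2 and conditional variances sigma_k^2 of GARCH(p,q).

    alpha = [alpha_1, ..., alpha_p], beta = [beta_1, ..., beta_q].
    h2_init and s2_init hold max(p, q) initial values of h^2 and sigma^2
    (a convenience assumption of this sketch).
    Returns the full lists (initial values followed by n simulated values).
    """
    rng = random.Random(seed)
    p, q = len(alpha), len(beta)
    m = max(p, q)
    if len(h2_init) != m or len(s2_init) != m:
        raise ValueError("need max(p, q) initial values of each kind")
    h2, s2 = list(h2_init), list(s2_init)
    for _ in range(n):
        sig2 = (alpha0
                + sum(alpha[i - 1] * h2[-i] for i in range(1, p + 1))
                + sum(beta[j - 1] * s2[-j] for j in range(1, q + 1)))  # formula (19)
        e = rng.gauss(0.0, 1.0)
        s2.append(sig2)
        h2.append(sig2 * e * e)  # h_k^2 = sigma_k^2 * eps_k^2, from (20)
    return h2, s2


def arma_residual(alpha0, alpha, beta, h2, s2, k):
    """Left side minus right side of (23) at list position k (k >= max(p, q))."""
    p, q = len(alpha), len(beta)
    nu = [x - y for x, y in zip(h2, s2)]  # nu = h^2 - sigma^2
    rhs = alpha0
    rhs += sum(alpha[i - 1] * h2[k - i] for i in range(1, p + 1))
    rhs += sum(beta[j - 1] * h2[k - j] for j in range(1, q + 1))
    rhs += nu[k] - sum(beta[j - 1] * nu[k - j] for j in range(1, q + 1))
    return h2[k] - rhs


alpha0, alpha, beta = 0.1, [0.2], [0.7]
h2, s2 = simulate_garch(alpha0, alpha, beta, 500, h2_init=[0.0], s2_init=[0.1], seed=5)
max_residual = max(abs(arma_residual(alpha0, alpha, beta, h2, s2, k))
                   for k in range(1, len(h2)))
```

Up to floating-point rounding, `max_residual` should vanish, since (23) is only a rearrangement of (19) and (17).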

There exist various generalizations of the ARCH and GARCH models (e.g.,
EGARCH, AGARCH, STARCH, NARCH, MARCH, HARCH) related, in the
final analysis, to one or another description of the variables $\sigma_n^2 = \mathsf{E}(h_n^2 \mid \mathscr{F}_{n-1})$ as
measurable functions with respect to the $\sigma$-algebras $\mathscr{F}_{n-1} = \sigma(\varepsilon_1, \ldots, \varepsilon_{n-1})$.


7. Stochastic volatility model. In the previous models, we had a single source of
randomness, a Gaussian sequence of independent variables $\varepsilon = (\varepsilon_n)$. The stochastic
volatility models involve two sources of randomness, $\varepsilon = (\varepsilon_n)$ and $\delta = (\delta_n)$. In
the most elementary case they are assumed to be independent standard Gaussian
sequences, that is, sequences of independent, $\mathscr{N}(0, 1)$-distributed random variables.
Let $\mathscr{G}_n = \sigma(\delta_1, \ldots, \delta_n)$. We set

$$h_n = \sigma_n \varepsilon_n, \tag{24}$$

where the $\sigma_n$ are $\mathscr{G}_n$-measurable. Then it is clear that

$$\mathrm{Law}(h_n \mid \mathscr{G}_n) = \mathscr{N}(0, \sigma_n^2), \tag{25}$$

i.e., the $\mathscr{G}_n$-conditional distribution of $h_n$ is Gaussian, with parameters $0$ and $\sigma_n^2$.
We now set

$$\sigma_n = e^{\Delta_n / 2}. \tag{26}$$

Then $\sigma_n^2 = e^{\Delta_n}$, where the $\Delta_n$ are $\mathscr{G}_n$-measurable. There exist highly popular
models in which the sequence $(\Delta_n)$ is governed by an autoregressive model (we
shall write $(\Delta_n) \in$ AR(p)) as follows:

$$\Delta_n = a_0 + a_1 \Delta_{n-1} + \cdots + a_p \Delta_{n-p} + c \delta_n.$$

A natural generalization of (24) is the scheme with

$$h_n = \mu_n + \sigma_n \varepsilon_n, \tag{27}$$

where $\mu_n$ and $\sigma_n$ are $\mathscr{G}_n$-measurable.

If $\varepsilon = (\varepsilon_n)$ in (27) is a normally distributed stationary sequence with $\mathsf{E}\varepsilon_n = 0$
and $\mathsf{E}\varepsilon_n^2 = 1$, while $\sigma = (\sigma_n)$ is independent of $\varepsilon = (\varepsilon_n)$, then we arrive at the
so-called Taylor model.
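
The scheme (24), (26) with $(\Delta_n) \in$ AR(1) can be sketched numerically as follows. This is an illustrative Python example only: the function name `simulate_sv`, the starting value `log_var0`, and the use of two separately seeded generators to stand in for the two independent sources of randomness are assumptions of the sketch.

```python
import math
import random


def simulate_sv(a0, a1, c, n, log_var0=0.0, seed=0):
    """Simulate h_k = sigma_k * eps_k with sigma_k = exp(Delta_k / 2) and
    Delta_k = a0 + a1 * Delta_{k-1} + c * delta_k (an AR(1) log-variance).

    Two separately seeded generators play the roles of the sources eps and delta.
    Returns the lists h, sigma, eps for k = 1, ..., n.
    """
    rng_eps = random.Random(seed)        # source of randomness eps
    rng_delta = random.Random(seed + 1)  # second source of randomness delta
    log_var = log_var0                   # Delta_0, a hypothetical initial value
    h, sigma, eps = [], [], []
    for _ in range(n):
        log_var = a0 + a1 * log_var + c * rng_delta.gauss(0.0, 1.0)
        s = math.exp(log_var / 2.0)      # formula (26)
        e = rng_eps.gauss(0.0, 1.0)
        h.append(s * e)                  # formula (24)
        sigma.append(s)
        eps.append(e)
    return h, sigma, eps


h, sigma, eps = simulate_sv(a0=-0.5, a1=0.9, c=0.3, n=500, seed=11)
```

Unlike the ARCH and GARCH schemes, here $\sigma_n$ is not a function of past values of $h$: it is driven by its own noise $\delta$.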

We complete our brief discussion of several Gaussian and conditionally Gaussian
models used in financial mathematics and financial engineering at this point. We
study the properties of these models more closely below, in §§ 2 and 3.

§1e. Binomial Model of Price Evolution


1. An exceptional role in probability theory is played by the Bernoulli scheme, the
sequence

$$\delta = (\delta_1, \delta_2, \ldots)$$

of independent identically distributed random variables that take only two values,
say, 1 and 0, with probabilities $p$ and $q$, $p + q = 1$.

It was for this scheme that the first limit theorem of probability theory, the Law
of large numbers, was proved (J. Bernoulli, “Ars Conjectandi”, 1713). It states that
for each $\varepsilon > 0$,

$$\mathsf{P}\left(\left|\frac{S_n}{n} - p\right| > \varepsilon\right) \to 0 \quad \text{as } n \to \infty,$$

where $\dfrac{S_n}{n} = \dfrac{\delta_1 + \cdots + \delta_n}{n}$ is the frequency of ‘ones’ in the sequence $\delta_1, \ldots, \delta_n$.

Likewise, it was for this scheme that several other remarkable results of probability
theory (the de Moivre-Laplace limit theorem, the Strong law of large numbers,
the Law of the iterated logarithm, the Arcsine law, ...) were first proved. Later on,
all these results turned out to have a much broader range of applications.

The role in financial mathematics of the Cox-Ross-Rubinstein binomial model
[82] described below is close in this sense to that of the Bernoulli scheme in classical
probability theory; very simple as it is, this model enables complete calculations
of many financial characteristics and instruments: rational option prices, hedging
strategies, and so on (see Chapter VII below).


2. We assume that all the financial operations proceed on a (B, S)-market formed
by a bank account $B = (B_n)_{n \ge 0}$ and some stock of value $S = (S_n)_{n \ge 0}$.
We can represent the evolution of $B$ and $S$ as follows:

$$B_n = (1 + r_n) B_{n-1}, \tag{1}$$

$$S_n = (1 + \rho_n) S_{n-1}, \tag{2}$$

or, equivalently,

$$\Delta B_n = r_n B_{n-1}, \qquad \Delta S_n = \rho_n S_{n-1},$$

where $B_0 > 0$ and $S_0 > 0$.
The main distinction between a bank account and stock is the fact that the
bank interest rate

$$r_n \text{ is } \mathscr{F}_{n-1}\text{-measurable},$$

while the ‘market’ interest on the stock

$$\rho_n \text{ is } \mathscr{F}_n\text{-measurable},$$

where $(\mathscr{F}_n)$ is the filtration (information flow) on a given probability space $(\Omega, \mathscr{F}, \mathsf{P})$.

In the framework of the Cox-Ross-Rubinstein binomial model (the ‘CRR-model’)
of a (B, S)-market one assumes that

$$r_n \equiv r = \mathrm{Const}$$

and that $\rho = (\rho_n)_{n \ge 1}$ is a Bernoulli sequence of independent identically distributed
random variables $\rho_1, \rho_2, \ldots$ taking two values, $a$ or $b$, $a < b$.
We can write each $\rho_n$ as the following sum:

$$\rho_n = \frac{a + b}{2} + \frac{b - a}{2}\,\varepsilon_n, \tag{3}$$

or as

$$\rho_n = a + (b - a)\,\delta_n, \tag{4}$$

to find out that $\rho_n = b$ if and only if $\varepsilon_n = 1$ (equivalently, $\delta_n = 1$), and
$\rho_n = a$ if and only if $\varepsilon_n = -1$ (equivalently, $\delta_n = 0$).


Our assumptions that $r_n \equiv \mathrm{Const}$, while the $\rho_n$ take only two values, enable us
to set the original probability space $\Omega$ from the very beginning to be the space of
binary sequences:

$$\Omega = \{a, b\}^{\infty}, \quad \text{or} \quad \Omega = \{-1, 1\}^{\infty}, \quad \text{or} \quad \Omega = \{0, 1\}^{\infty}.$$

By (2),

$$S_n = S_0 \prod_{k \le n} (1 + \rho_k). \tag{5}$$

Comparing this expression and (5) in § 1a we see that the $\rho_k$ play the role of the
variables $\hat{h}_k$ introduced there. Clearly, we can also represent the $S_n$ as follows:

$$S_n = S_0 e^{H_n} = S_0 e^{h_1 + \cdots + h_n},$$

where $h_k = \ln(1 + \rho_k)$ (cf. (1) and (10) in § 1a).


3. We now discuss the particular case of $a$ and $b$ satisfying the relations

$$a = \lambda^{-1} - 1, \qquad b = \lambda - 1$$

for some $\lambda > 1$.
In this case

$$S_n = \begin{cases} \lambda S_{n-1} & \text{if } \rho_n = b, \\ \lambda^{-1} S_{n-1} & \text{if } \rho_n = a. \end{cases}$$

Regarding (3) as the definition of $\varepsilon_n$ ($= \pm 1$), we can represent the $S_n$ as follows:

$$S_n = S_0 \lambda^{\varepsilon_1 + \cdots + \varepsilon_n}, \tag{7}$$

which is the same as

$$S_n = S_0 e^{h_1 + \cdots + h_n} \tag{8}$$

with $h_k = \varepsilon_k \ln \lambda$.
This random sequence $S = (S_n)$ is called a geometric random walk on the set

$$E_{S_0} = \{S_0 \lambda^k : k = 0, \pm 1, \ldots\}.$$

If $S_0 \in E = \{\lambda^k : k = 0, \pm 1, \ldots\}$, then $E_{S_0} = E$. In this case $S = (S_n)$ is called a
Markov walk on the phase space $E = \{\lambda^k : k = 0, \pm 1, \ldots\}$.

The binomial model in question is a discrete analogue of a geometric Brownian
motion $S = (S_t)_{t \ge 0}$, i.e., of a random process having the following representation
(cf. (8)):

$$S_t = S_0 e^{\sigma W_t + (\mu - \sigma^2/2) t},$$

where $W = (W_t)_{t \ge 0}$ is a standard Wiener process or a standard Brownian motion
(see Chapter III, § 3a).

It seems appropriate to recall in this connection that it is an arithmetic random
walk $S_n = S_{n-1} + \xi_n$ with a Bernoulli sequence $\xi = (\xi_n)_{n \ge 1}$ that is a discrete
counterpart to usual Brownian motion.
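
The equivalence of the multiplicative recursion (2) and the geometric random walk (7) in the case $a = \lambda^{-1} - 1$, $b = \lambda - 1$ can be verified on a simulated path. The Python sketch below is an illustration only; the names `crr_path` and `geometric_walk` and the parameter values are hypothetical.

```python
import random


def crr_path(s0, a, b, p, n, seed=0):
    """Simulate S_k = (1 + rho_k) * S_{k-1}, where rho_k = b with probability p, else a."""
    rng = random.Random(seed)
    rho = [b if rng.random() < p else a for _ in range(n)]
    s = [s0]
    for r in rho:
        s.append(s[-1] * (1.0 + r))
    return s, rho


def geometric_walk(s0, lam, rho, a, b):
    """Rebuild the path as S_k = S_0 * lam**(eps_1 + ... + eps_k), eps_k = +1 or -1."""
    # eps_k is recovered from formula (3): rho_k = (a + b)/2 + (b - a)/2 * eps_k
    eps = [round((2.0 * r - (a + b)) / (b - a)) for r in rho]
    partial, out = 0, [s0]
    for e in eps:
        partial += e
        out.append(s0 * lam ** partial)
    return out, eps


lam = 2.0
a, b = 1.0 / lam - 1.0, lam - 1.0  # a = -0.5, b = 1.0
s, rho = crr_path(100.0, a, b, 0.5, 20, seed=7)
walk, eps = geometric_walk(100.0, lam, rho, a, b)
```

Both constructions should give the same path, and every value lies in the set $\{S_0 \lambda^k\}$ described above.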


4. We started our previous discussion with the assumption that all considerations
proceed in the framework of some filtered probability space $(\Omega, \mathscr{F}, (\mathscr{F}_n), \mathsf{P})$ with
probability measure $\mathsf{P}$. The question of the properties of this probability measure $\mathsf{P}$
and, therefore, of the quantities $p = \mathsf{P}(\rho_n = b)$ and $q = \mathsf{P}(\rho_n = a)$ is not so easy
in general. It would be more realistic in a certain sense to assume that there is an
entire family $\mathscr{P} = \{\mathsf{P}\}$ of probability measures on $(\Omega, \mathscr{F})$, rather than a single one,
and that the corresponding values of $p = \mathsf{P}(\rho_n = b)$ belong to the interval $(0, 1)$.

As regards the issue of possible generalizations of the binomial model just
discussed, it would be reasonable to assume that the variables $\rho_n$ can take all values
in the interval $[a, b]$ rather than only the two values $a$ and $b$. In that case, the
probability distribution for $\rho_n$ can be an arbitrary distribution on $[a, b]$. This is the
model we shall consider in Chapter V, § 1c in connection with the calculations of
rational option prices on so-called incomplete markets. In the same section we shall
also consider a nonprobabilistic approach based on the assumption that the $\rho_n$ are
‘chaotic’ variables. (As regards the description of the evolution of prices involving
models of ‘dynamical chaos’, see §§ 4a, b below.)

§1f. Models with Discrete Intervention of Chance


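1. In the study of sequences $H = (H_n)_{n \ge 0}$, it is often convenient to ‘embed’ them
(in the sense that we explain below) in certain schemes with continuous time.

Given a stochastic basis $\mathscr{B} = (\Omega, \mathscr{F}, (\mathscr{F}_n)_{n \ge 0}, \mathsf{P})$ we consider an associated basis
$\widetilde{\mathscr{B}}$ with continuous time ($t \ge 0$) defined as follows:

$$\widetilde{\mathscr{B}} = (\Omega, \mathscr{F}, (\widetilde{\mathscr{F}}_t)_{t \ge 0}, \mathsf{P}),$$

where $\widetilde{\mathscr{F}}_t = \mathscr{F}_{[t]}$ and $[t]$ is the integer part of $t$.
Given a stochastic sequence $H = (H_n, \mathscr{F}_n)$, we introduce the process
$\widetilde{H} = (\widetilde{H}_t, \widetilde{\mathscr{F}}_t)$ (with continuous time) by setting^b

$$\widetilde{H}_t = H_{[t]}.$$

Thus (see Fig. 13), the trajectories $\widetilde{H}_t$, $t \ge 0$, are piecewise constant and
right-continuous, making jumps $\Delta\widetilde{H}_t = \widetilde{H}_t - \widetilde{H}_{t-}$ for $t = 1, 2, \ldots$. Moreover,
$\Delta\widetilde{H}_n = \Delta H_n = h_n$.

Clearly, the converse is also true: a random process $\widetilde{H} = (\widetilde{H}_t, \widetilde{\mathscr{F}}_t)_{t \ge 0}$ on a
probability space $\widetilde{\mathscr{B}} = (\Omega, \mathscr{F}, (\widetilde{\mathscr{F}}_t)_{t \ge 0}, \mathsf{P})$ that has piecewise constant right-continuous
trajectories making jumps for $t = 1, 2, \ldots$, is actually a process with discrete time
of the above form.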
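
The embedding $\widetilde{H}_t = H_{[t]}$ is simple enough to state in code. The Python fragment below is a small sketch; the helper name `embed` and the sample values of $H$ are assumptions made for illustration.

```python
import math


def embed(H):
    """Given H = [H_0, H_1, ..., H_N], return the function t -> H_[t] for 0 <= t < N + 1."""
    def H_tilde(t):
        if t < 0 or t >= len(H):
            raise ValueError("t must lie in [0, N + 1)")
        return H[math.floor(t)]  # [t] is the integer part of t
    return H_tilde


H = [0.0, 1.5, 1.0, 2.5]  # hypothetical values H_0, ..., H_3
H_tilde = embed(H)
jump_at_2 = H_tilde(2) - H_tilde(2 - 1e-9)  # equals h_2 = H_2 - H_1
```

The path is constant on each interval $[n, n+1)$ and takes its new value exactly at the integer time, which is the right-continuity noted above.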


FIGURE 13. Embedding a sequence $(H_n)$ with discrete time in a
scheme with continuous time


^b As in the discrete-time case, we write $\widetilde{H} = (\widetilde{H}_t, \widetilde{\mathscr{F}}_t)$ to indicate that the variables $\widetilde{H}_t$
are $\widetilde{\mathscr{F}}_t$-measurable for each $t \ge 0$.

2. So far, in our discussion of various models for the dynamics of prices we have
either (in most cases) considered models with prices $S = (S_n)$ registered at discrete
instants of time $n = 0, 1, \ldots$, or (as in the Bachelier case) models with prices
$S = (S_t)_{t \ge 0}$ described by a continuous random process (e.g., the Brownian motion)
with continuous time $t \ge 0$.
In practice, the statistical analysis (see Chapter IV) of the evolution of actual
prices $S = (S_t)_{t \ge 0}$ shows that they have a hybrid structure in a certain sense.
More precisely, this means the following. According to a large stock of data, the
trajectories of the prices $S = (S_t)_{t \ge 0}$ look as in the picture below:


FIGURE 14. A process with discrete intervention of chance
(at times $\tau_1, \tau_2, \tau_3, \ldots$)


In other words, representing $S_t$ by the formula $S_t = S_0 e^{H_t}$ we see that

$$H_t = \sum_{k \ge 1} h_k I(\tau_k \le t), \tag{1}$$

where $\tau_1, \tau_2, \ldots$ are the instants of jumps and $h_k$ are their amplitudes
($\Delta H_{\tau_k} = H_{\tau_k} - H_{\tau_k -} = h_k$).

Assuming that all our considerations proceed with respect to a stochastic basis
$(\Omega, \mathscr{F}, (\mathscr{F}_t)_{t \ge 0}, \mathsf{P})$, it would be natural to impose certain ‘measurability’ conditions
on the (random) instants of jumps $\tau_k$, $k \ge 1$, and the variables $h_k$, $k \ge 1$, so as to
ensure, at any rate, that $H_t$ be $\mathscr{F}_t$-measurable for each $t \ge 0$, i.e., can be defined
by the ‘data’ accessible to an observer over the period $[0, t]$.


3. To this end we introduce several definitions in which by ‘an extended random
variable’ $\tau = \tau(\omega)$ we mean a map

$$\Omega \to \overline{\mathbb{R}}_+ = [0, \infty].$$

Recall that, in accordance with the traditional probabilistic definitions, a real-valued
random variable can take only finite values, i.e., values in $\mathbb{R} = (-\infty, \infty)$.

DEFINITION 1. We call a nonnegative random variable $\tau = \tau(\omega)$ a Markov time or a
random variable independent of the ‘future’ if

$$\{\omega : \tau(\omega) \le t\} \in \mathscr{F}_t \tag{2}$$

for each $t \ge 0$.

Markov times are also called stopping times, though sometimes one reserves the
latter term for the Markov times $\tau = \tau(\omega)$ such that either $\tau(\omega) < \infty$ for all $\omega \in \Omega$,
or $\mathsf{P}(\tau(\omega) < \infty) = 1$.

The meaning of condition (2) becomes fairly obvious if we interpret $\tau(\omega)$ as the
instant when one must make certain ‘decisions’ (e.g., to buy or to sell stock). Since
$\mathscr{F}_t$ is a $\sigma$-algebra, (2) is equivalent to the condition $\{\tau(\omega) > t\} \in \mathscr{F}_t$, meaning
that one’s intention at time $t$ to postpone this ‘decision’ until later is determined
by the information $\mathscr{F}_t$ accessible over the period $[0, t]$, and one cannot take into
consideration the ‘future’ (i.e., what might happen after time $t$).


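DEFINITION 2. Let $\tau = \tau(\omega)$ be some stopping time. Then we denote by $\mathscr{F}_\tau$ the
collection of sets $A \in \mathscr{F}$ such that

$$A \cap \{\omega : \tau(\omega) \le t\} \in \mathscr{F}_t \tag{3}$$

for each $t \in \mathbb{R}_+ = [0, \infty)$.
By $\mathscr{F}_{\tau-}$ we denote the $\sigma$-algebra generated by $\mathscr{F}_0$ and all the sets
$A \cap \{\omega : t < \tau(\omega)\}$, where $A \in \mathscr{F}_t$ and $t \in \mathbb{R}_+$.

It is easy to see that $\mathscr{F}_\tau$ is a $\sigma$-algebra.

If we interpret $\mathscr{F}_t$ as the collection of all events that have occurred not later than time $t$,
then it is reasonable to interpret $\mathscr{F}_\tau$ and $\mathscr{F}_{\tau-}$ as the collections of events observable
over the periods $[0, \tau]$ and $[0, \tau)$, respectively.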


4. We now turn back to the representation (1) of the process $H = (H_t)_{t \ge 0}$ as
a process with discrete intervention of chance (at instants of time $\tau_1, \tau_2, \ldots$; see
Fig. 14). We assume that

$$0 < \tau_1(\omega) < \tau_2(\omega) < \cdots$$

for all (or $\mathsf{P}$-almost all) $\omega \in \Omega$ and that $\tau_k = \tau_k(\omega)$ is a stopping time for each
$k \ge 1$, while the variables $h_k = h_k(\omega)$ are $\mathscr{F}_{\tau_k}$-measurable.
It follows, in particular, from these assumptions and conditions (2) and (3) that
the process $H = (H_t)_{t \ge 0}$ is adapted to the flow $(\mathscr{F}_t)_{t \ge 0}$, i.e.,

$$H_t \text{ is } \mathscr{F}_t\text{-measurable}$$

for each $t \ge 0$.
Thus, in accordance with the above conventions, we can write $H = (H_t, \mathscr{F}_t)_{t \ge 0}$.




The real-valued process $H = (H_t)_{t \ge 0}$ defined by (1) is a particular case of
so-called multivariate point processes ranging in the phase space $E$ ([250]; in our case
we have $E = \mathbb{R}$), while the term ‘point (or counting) process’ in the proper sense is
related to the case of $h_n \equiv 1$, i.e., to the process

$$N_t = \sum_{k} I(\tau_k \le t), \quad t \ge 0. \tag{4}$$

Remark. Sometimes, one relates ‘point’ processes only to a sequence of instants
$\tau = (\tau_1, \tau_2, \ldots)$ and reserves the attribute ‘counting’ for the process $N = (N_t)_{t \ge 0}$
corresponding to this sequence in accordance with formula (4). It is clear that there
exists a one-to-one correspondence between $\tau$ and $N$: we can determine $N$ by $\tau$ in
accordance with (4), and we can determine $\tau$ by $N$ because

$$\tau_k = \inf\{t : N_t = k\}.$$

We note that, as usual, we set here $\tau_k(\omega)$ to be equal to $+\infty$ if $\{t : N_t = k\} = \varnothing$.
For instance, considering the stopping times corresponding to the ‘one-point’ point
process with trajectories as in Fig. 15 we can say that $\tau_2 = \tau_3 = \cdots = +\infty$.
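
The correspondence between the jump times $\tau$ and the counting process $N$ can be made concrete for a finite list of jump times. The Python sketch below is illustrative only; the function names and sample data are hypothetical, and `tau_from_N` searches only among the jump times, which suffices because $N$ is piecewise constant, right-continuous, and changes only at those instants.

```python
import math


def H_at(t, taus, hs):
    """H_t = sum of h_k over all k with tau_k <= t (formula (1))."""
    return sum(h for tau, h in zip(taus, hs) if tau <= t)


def N_at(t, taus):
    """N_t = number of k with tau_k <= t (formula (4))."""
    return sum(1 for tau in taus if tau <= t)


def tau_from_N(k, taus):
    """tau_k = inf{t : N_t = k} for k >= 1, with inf of the empty set equal to +infinity."""
    candidates = [t for t in taus if N_at(t, taus) == k]
    return min(candidates) if candidates else math.inf


taus = [0.5, 1.2, 3.0]   # strictly increasing jump times (sample data)
hs = [0.1, -0.2, 0.05]   # jump amplitudes (sample data)
recovered = [tau_from_N(k, taus) for k in (1, 2, 3)]
one_point = tau_from_N(2, [1.0])  # 'one-point' process: tau_2 = +infinity
```

The last line mirrors the ‘one-point’ example: with a single jump, the level $N_t = 2$ is never reached.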


FIGURE 15. ‘One-point’ point process


5. The point processes $N = (N_t)_{t \ge 0}$ defined in (4) can be described in an equivalent
way using the concept of a random change of time [250; Chapter II, § 3b].

DEFINITION 3. We say that a family of random variables $\sigma = (\sigma_t)_{t \ge 0}$ defined
on a stochastic basis $\mathscr{B} = (\Omega, \mathscr{F}, (\mathscr{F}_n)_{n \in \mathbb{N}}, \mathsf{P})$ and ranging in $\mathbb{N} = \{0, 1, \ldots\}$ or
$\overline{\mathbb{N}} = \{0, 1, \ldots, \infty\}$ is a random change of time if

(i) the $\sigma_t$ are stopping times (with respect to $(\mathscr{F}_n)_{n \in \mathbb{N}}$) for all $t$;
(ii) $\sigma_0 = 0$;
(iii) each trajectory $\sigma_t(\omega)$, $t \ge 0$, is increasing, right-continuous, with jumps of
amplitude one.


We now associate with $\sigma = (\sigma_t)_{t \ge 0}$ the stochastic basis

$$\mathscr{B}^\sigma = (\Omega, \mathscr{F}, (\mathscr{F}_{\sigma_t})_{t \ge 0}, \mathsf{P})$$

with continuous time and set

$$\tau_k = \inf\{t : \sigma_t \ge k\}, \quad k \in \mathbb{N}.$$

It is easy to show that the $\tau_k = \tau_k(\omega)$ are stopping times with respect to the
flow $(\mathscr{F}_{\sigma_t})_{t \ge 0}$. Moreover, if $\sigma_t < \infty$ for all $t$, then

$$H_t = \sum_{k \ge 1} h_k I(\tau_k \le t) = \sum_{1 \le k \le \sigma_t} h_k. \tag{5}$$

It is also clear that

$$\sigma_t = \sum_{k \ge 1} I(\tau_k \le t),$$

i.e., $\sigma_t$ is the number of decisions taken over the period $[0, t]$. To put it another way,
the random change of time $\sigma = (\sigma_t)_{t \ge 0}$ constructed by the sequence $(\tau_1, \tau_2, \ldots)$ is
just the counting process $N = (N_t)$ associated with this sequence, i.e., the process

$$N_t = \sum_{k \ge 1} I(\tau_k \le t).$$

In all the above formulas $t$ plays the role of real (‘physical’) time, while $\sigma_t$
plays the role of operational time counting the ‘decisions’ taken in ‘physical’ time $t$.
(We return to the issue of operational time in the empirical analysis of financial
time series in Chapter IV, § 3d.)

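We note also that if

$$\sigma_t = [t],$$

then $\mathscr{B}^\sigma = \widetilde{\mathscr{B}}$, where $\widetilde{\mathscr{B}}$ is the above-mentioned stochastic basis
$(\Omega, \mathscr{F}, (\widetilde{\mathscr{F}}_t)_{t \ge 0}, \mathsf{P})$ with $\widetilde{\mathscr{F}}_t = \mathscr{F}_{[t]}$, constructed from the basis with discrete time.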


6. As follows from § 1b, the Doob decomposition plays a key role in the stochastic
analysis of the sequence $H = (H_n)_{n \ge 0}$ by enabling one to distinguish the
‘martingale’ and the ‘predictable’ components of $H$ in their dependence on the flow
$(\mathscr{F}_n)_{n \ge 0}$ of incoming data (on the market situation, in the financial framework).

A similar decomposition (the Doob-Meyer decomposition) can be constructed
both for the counting process $N$ and for the multivariate point process $H$. This (as
in the discrete-time case) is the main starting point in the stochastic analysis of these
processes involving the concepts of ‘martingale’ and ‘predictability’ (see Chapter III,
§ 5b and, in greater detail, [250]).


2. Linear Stochastic Models 


The empirical analysis of the evolution of financial (and, of course, many other: 
economic, social, and so on) indexes, characteristics, ... must start with construct- 
ing an appropriate probabilistic, statistical (or some other) model; its proper choice 
is a rather complicated task. 

The general theory of time series has various ‘standard’ linear models in its
store. The first to mention are MA(q) (the moving average model of order q),
AR(p) (the autoregressive model of order p), and ARMA(p,q) (the mixed
autoregressive moving average model of order (p,q)) considered in § 1d. These models are
thoroughly studied in the theory of time series, especially if one assumes that they
are stationary.

The reasons for the popularity of these models are their simplicity on the one
hand and, on the other, the fact that, even with few parameters involved, these
models deliver good approximations of stationary sequences in a rather broad class.

However, the class of ‘econometric’ time series is far from being exhausted by 
stationary series. Often, one can fairly clearly discern the following three ingredients 
of statistical data: 


• a slowly changing (e.g., ‘inflationary’) trend component (x);
• periodic or aperiodic cycles (y);
• an irregular, fluctuating (‘stochastic’ or ‘chaotic’) component (z).


These components can be combined in the observable data (h) in diverse ways.
Schematically, we shall write this down as

$$h = x * y * z,$$

where the symbol ‘$*$’ can denote, e.g., addition ‘$+$’, multiplication ‘$\times$’, and so on.

There are many monographs devoted to the theory of time series and their
applications to the analysis of financial data (see, e.g., [62], [193], [202], [211], [212],
[351], or [460]).

In what follows, we discuss several linear (and after that, nonlinear) models
with the intention to give an idea of their structure, peculiarities, and properties
used in the empirical analysis of financial data.

It should be mentioned here that, in the long run, one of the important objectives
in the empirical analysis of the statistics of financial indexes is to predict, or forecast,
the ‘future dynamics of prices’:


FIGURE 16. The region between the curves 1 and 2 is the confidence
domain (corresponding to a certain degree of confidence),
in which the prices are supposed to evolve in the future.
(Axes: time $t$, with the ‘present’ marked, and price $S_t$; the plot also shows
the predicted ‘most probable’ price dynamics.)


The reliability of these predictions depends, of course, on a successful choice of
a model, the precision in the evaluation of the parameters of this model, and the
quality of extrapolation (linear or nonlinear).

The analysis of time series that we carry out in Chapter IV is instructive in that
respect. We show there how, starting from simple linear Gaussian models, one is
forced to modify them, to make them more complex, in order to obtain, finally, a model
‘capturing’ the phenomena discovered by the empirical analysis (e.g., the failure of
the Gaussian property, the ‘cluster’ property, and the effect of ‘long memory’ in
prices).

Recalling our scheme ‘$h = x * y * z$’ of the interplay between the three ingredients
$x$, $y$, and $z$ of the prices $h$, we can say that most weight in our discussion will be
put on the last, ‘fluctuating’ component $z$, and then (in the case of exchange rates)
on the periodic (seasonal) component $y$.

We shall not discuss the details of the analysis of the trend component $x$,
but we point out that it is often responsible for the ‘nonstationary’ behavior of
the models under consideration. A classical example is provided by the so-called
ARIMA(p, d, q) models (AutoRegressive Integrated Moving Average models; here $d$
is the order of integration) that are widely used and promoted by G. E. P. Box and
G. M. Jenkins ([53]; see § 2c below for greater detail).

§2a. Moving Average Model MA(q)

1. It is assumed in all the models that follow (either linear or nonlinear) that we
have a certain ‘basic’ sequence $\varepsilon = (\varepsilon_n)$, which is assumed to be white noise (see
Fig. 17) in the theory of time series and is identified with the source of randomness
responsible for the stochastic behavior of the statistical objects in question.


FIGURE 17. Computer simulation of ‘white noise’ $h_n = \sigma \varepsilon_n$ with $\sigma = 0.1$
and $\varepsilon_n \sim \mathscr{N}(0, 1)$


Alternatively, it is assumed (in the ‘$L^2$-theory’) that a sequence $\varepsilon = (\varepsilon_n)$ is white
noise in the wide sense, i.e., $\mathsf{E}\varepsilon_n = 0$, $\mathsf{E}\varepsilon_n^2 < \infty$, and

$$\mathsf{E}\varepsilon_n \varepsilon_m = 0 \tag{1}$$

for all $n \ne m$. (It is convenient to assume at this point that the time parameter $n$
can take the values $0, \pm 1, \pm 2, \ldots$.)

In other words, white noise in the wide sense is a square integrable sequence of
uncorrelated random variables with zero expectations.

If we require additionally that these variables be Gaussian (normal), then we
call the sequence $\varepsilon = (\varepsilon_n)$ white noise in the strict sense or simply white noise. This
is equivalent to the condition that $\varepsilon = (\varepsilon_n)$ is a sequence of independent normally
distributed ($\varepsilon_n \sim \mathscr{N}(0, \sigma_\varepsilon^2)$) random variables. In what follows we assume that
$\sigma_\varepsilon^2 = 1$. (One says in this case that $\varepsilon = (\varepsilon_n)$ is a standard Gaussian sequence; it is
instructive to compare this concept with that of a fractal Gaussian noise, which is
also used in the financial data statistics, see Chapter III, § 2d.)


2. In the moving average scheme MA(q) describing the evolution of the sequence
$h = (h_n)$, one assumes that the $h_n$ can be constructed from the ‘basic’ sequence $\varepsilon$
as follows:

$$h_n = (\mu + b_1 \varepsilon_{n-1} + \cdots + b_q \varepsilon_{n-q}) + b_0 \varepsilon_n. \tag{2}$$

Here $q$ is a parameter describing the degree of our dependence on the ‘past’ while
$\varepsilon_n$ ‘updates’ the information contained in the $\sigma$-algebra $\mathscr{F}_{n-1} = \sigma(\varepsilon_{n-1}, \varepsilon_{n-2}, \ldots)$;
see Fig. 18.

FIGURE 18. Computer simulation of a sequence $h = (h_n)$ governed by
the MA(1) model with $h_n = \mu + b_1 \varepsilon_{n-1} + b_0 \varepsilon_n$, where $\mu = 1$, $b_1 = 1$,
$b_0 = 0.1$, and $\varepsilon_n \sim \mathscr{N}(0, 1)$


For brevity, it is convenient to introduce the lag operator $L$ acting on number
sequences $x = (x_n)$ by the formula

$$L x_n = x_{n-1}. \tag{3}$$

Since $L(L x_n) = L x_{n-1} = x_{n-2}$, it is natural to use the notation $L^2$ for the
operator

$$L^2 x_n = x_{n-2}.$$

In general, we set $L^k x_n = x_{n-k}$ for all $k \ge 0$.
We point out the following useful, albeit simple, properties of the operator $L$
(here $c$, $c_1$, and $c_2$ are arbitrary constants):

$$L(c x_n) = c L x_n,$$
$$L(x_n + y_n) = L x_n + L y_n,$$
$$(c_1 L + c_2 L^2) x_n = c_1 L x_n + c_2 L^2 x_n = c_1 x_{n-1} + c_2 x_{n-2},$$
$$(1 - \lambda_1 L)(1 - \lambda_2 L) x_n = x_n - (\lambda_1 + \lambda_2) x_{n-1} + (\lambda_1 \lambda_2) x_{n-2}.$$

Using $L$ we can write (2) in the following compact form:

$$h_n = \mu + \beta(L) \varepsilon_n, \tag{4}$$

where

$$\beta(L) = b_0 + b_1 L + \cdots + b_q L^q.$$
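
The algebra of the lag operator can be checked mechanically by treating a sequence as a function of the integer index $n$. The following Python sketch is one possible encoding, not the only one; the helper names `apply_lag_poly` and `multiply_lag_polys` and the test sequence $x_n = n^2$ are assumptions of the example.

```python
def apply_lag_poly(coeffs, x):
    """Return the sequence y_n = (c_0 + c_1 L + ... + c_m L^m) x_n, using L^k x_n = x_{n-k}."""
    return lambda n: sum(c * x(n - k) for k, c in enumerate(coeffs))


def multiply_lag_polys(c, d):
    """Coefficients of the product of two polynomials in L (ordinary polynomial product)."""
    out = [0] * (len(c) + len(d) - 1)
    for i, ci in enumerate(c):
        for j, dj in enumerate(d):
            out[i + j] += ci * dj
    return out


x = lambda n: n * n  # an arbitrary test sequence
lam1, lam2 = 2, 3
# apply (1 - lam2 L) first, then (1 - lam1 L)
step = apply_lag_poly([1, -lam1], apply_lag_poly([1, -lam2], x))
# coefficients of (1 - lam1 L)(1 - lam2 L): [1, -(lam1 + lam2), lam1 * lam2]
product = multiply_lag_polys([1, -lam1], [1, -lam2])
```

Applying the two factors one after the other gives the same sequence as applying the product polynomial, which is the last of the properties listed above.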


3. We now consider the question of the probabilistic characteristics of the sequence
$h = (h_n)$.
Let $q = 1$. Then

$$h_n = \mu + b_0 \varepsilon_n + b_1 \varepsilon_{n-1}, \tag{5}$$

and we immediately see that

$$\mathsf{E} h_n = \mu, \qquad \mathsf{D} h_n = b_0^2 + b_1^2, \tag{6}$$

$$\mathrm{Cov}(h_n, h_{n+1}) = b_0 b_1, \qquad \mathrm{Cov}(h_n, h_{n+k}) = 0, \quad k > 1. \tag{7}$$

The last two properties mean that $h = (h_n)$ is a sequence with correlated
consecutive elements ($h_n$ and $h_{n+1}$), while for $k \ge 2$ the correlation between the $h_n$
and the $h_{n+k}$ is zero.

We note also that if $b_0 b_1 > 0$, then the elements $h_n$ and $h_{n+1}$ have positive
correlation, while if $b_0 b_1 < 0$, then their correlation is negative. (We come across a
similar situation in Chapter IV, § 3c, when explaining the phenomenon of negative
correlation in the case of exchange rates.)

By (6) and (7), the elements of $h = (h_n)$ have mean values, variances, and
covariances independent of $n$. (Of course, this is already incorporated into our
assumption that $\varepsilon = (\varepsilon_n)$ is white noise in the wide sense with $\mathsf{E}\varepsilon_n = 0$ and
$\mathsf{D}\varepsilon_n = 1$ and that the coefficients in (5) are independent of $n$.) Thus, the sequence
$h = (h_n)$ is (just by definition) stationary in the wide sense. If we assume in
addition that $\varepsilon = (\varepsilon_n)$ is a Gaussian sequence, then the sequence $h = (h_n)$ is also
Gaussian and, therefore, all its probabilistic properties can be expressed in terms
of expectation, variance, and covariance. In this case, the sequence $h = (h_n)$ is
stationary in the strict sense, i.e.,

$$\mathrm{Law}(h_{i_1}, \ldots, h_{i_n}) = \mathrm{Law}(h_{i_1 + k}, \ldots, h_{i_n + k})$$

for all $n \ge 1$, $i_1, \ldots, i_n$, and arbitrary $k$.


4. Let a trajectory $(h_1, \ldots, h_n)$ be a result of observations of the variables $h_k$ at
instants of time $k = 1, \ldots, n$ and let

$$\bar h_n = \frac{1}{n}\sum_{k=1}^{n} h_k \tag{8}$$

be the time average.
From the statistical viewpoint, the point of considering the ‘statistic’ $\bar h_n$ lies in
the fact that it is a natural ‘choice’ for the role of an estimator of the
expectation $\mu$.

If we measure the quality of this estimator by the value of the mean-square deviation
$\Delta_n^2 = \mathsf{E}|\bar h_n - \mu|^2$, then the following test is useful: as $n \to \infty$ we have

$$\Delta_n^2 \to 0 \iff \frac{1}{n}\sum_{k=1}^{n} R(k) \to 0, \tag{9}$$

where $R(k) = \operatorname{Cov}(h_n, h_{n+k})$ and $h = (h_n)$ is an arbitrary sequence that is
stationary in the wide sense.
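Indeed, let (for simplicity) $\mu = \mathsf{E}h_n = 0$. Then, by the Cauchy-Schwarz-Bunyakovskii inequality,

$$\Bigl|\frac{1}{n}\sum_{k=1}^{n} R(k)\Bigr|^2 = \Bigl|\mathsf{E}\,h_0\Bigl(\frac{1}{n}\sum_{k=1}^{n} h_k\Bigr)\Bigr|^2 \le \mathsf{E}h_0^2 \cdot \mathsf{E}\Bigl|\frac{1}{n}\sum_{k=1}^{n} h_k\Bigr|^2;$$

therefore the implication $\Longrightarrow$ in (9) holds.
On the other hand,

$$\mathsf{E}\Bigl|\frac{1}{n}\sum_{k=1}^{n} h_k\Bigr|^2 = \frac{1}{n^2}\,\mathsf{E}\Bigl[\sum_{l=1}^{n} h_l^2 + 2\sum_{l=1}^{n}\sum_{k=1}^{l-1} h_k h_l\Bigr] = \frac{2}{n^2}\sum_{l=1}^{n}\sum_{k=0}^{l-1} R(k) - \frac{1}{n}R(0).$$

We now choose $\delta > 0$ and $n(\delta)$ such that
$$\Bigl|\frac{1}{l}\sum_{k=0}^{l-1} R(k)\Bigr| \le \delta$$
for all $l \ge n(\delta)$. Then
$$\Bigl|\frac{1}{n^2}\sum_{l=1}^{n}\sum_{k=0}^{l-1} R(k)\Bigr| = \Bigl|\frac{1}{n^2}\sum_{l=1}^{n(\delta)}\sum_{k=0}^{l-1} R(k) + \frac{1}{n^2}\sum_{l=n(\delta)+1}^{n}\sum_{k=0}^{l-1} R(k)\Bigr|$$
$$\le \frac{1}{n^2}\Bigl|\sum_{l=1}^{n(\delta)}\sum_{k=0}^{l-1} R(k)\Bigr| + \frac{1}{n^2}\sum_{l=n(\delta)+1}^{n} l\,\Bigl|\frac{1}{l}\sum_{k=0}^{l-1} R(k)\Bigr|$$
$$\le \frac{1}{n^2}\Bigl|\sum_{l=1}^{n(\delta)}\sum_{k=0}^{l-1} R(k)\Bigr| + \delta$$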
for $n \ge n(\delta)$, and since $n(\delta) < \infty$ and $|R(0)| \le \mathrm{Const}$, it follows that

$$\limsup_{n\to\infty} \Delta_n^2 \le 2\delta.$$

Since $\delta > 0$ was chosen arbitrarily, this proves the implication $\Longleftarrow$.

Thus, the model MA(1) is ‘ergodic’ in the following sense: the time averages
$\bar h_n$ converge (in $L^2$) to the mean value $\mu$, or, as it is alternatively called, to the
ensemble average. (As regards the general concepts of ‘ergodicity’, ‘mixing’, ‘ergodic
theorems’, see, e.g., [439; Chapter V].)
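For the MA(1) model the rate of this convergence can be written out explicitly. The short computation below is a worked illustration, not part of the original argument; it takes $\mu = 0$ and uses only (6), (7), and the definition (8).

```latex
% Mean-square error of the time average in MA(1), with \mu = 0:
% R(0) = b_0^2 + b_1^2, R(\pm 1) = b_0 b_1, R(k) = 0 for |k| >= 2.
\Delta_n^2
  = \mathsf{E}\Bigl|\frac{1}{n}\sum_{k=1}^{n} h_k\Bigr|^2
  = \frac{1}{n^2}\sum_{k=1}^{n}\sum_{l=1}^{n} R(k-l)
  = \frac{1}{n^2}\bigl[\,n R(0) + 2(n-1) R(1)\bigr]
  = \frac{b_0^2 + b_1^2}{n} + \frac{2(n-1)\,b_0 b_1}{n^2}
  \;\longrightarrow\; 0 \quad (n \to \infty).
```

In the double sum there are $n$ pairs with $k = l$ and $2(n-1)$ pairs with $|k - l| = 1$; all other pairs contribute zero. Thus $\Delta_n^2$ is of order $(b_0 + b_1)^2/n$ for large $n$.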

The importance of this ‘ergodic’ result from the statistical viewpoint is completely
clear: it substantiates the estimation of mean values in terms of time averages
calculated from the observable values of $h_1, h_2, \ldots$. It is clear that ‘ergodicity’
plays a key role in the substantiation of the statistical methods of estimation from
samples not only in the case of mean values, but also for other characteristics:
moments of various orders, covariances, and so on.


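5. We recall that, given a covariance $\operatorname{Cov}(h_n, h_m)$, we can define the correlation
$\operatorname{Corr}(h_n, h_m)$ by the formula

$$\operatorname{Corr}(h_n, h_m) = \frac{\operatorname{Cov}(h_n, h_m)}{\sqrt{\mathsf{D}h_n\,\mathsf{D}h_m}}. \tag{10}$$

By the Cauchy-Schwarz-Bunyakovskii inequality, $|\operatorname{Corr}(h_n, h_m)| \le 1$.
In the stationary case, $\operatorname{Cov}(h_n, h_{n+k})$ is independent of $n$. We denote this value
by $R(k)$ and set

$$\rho(k) = \operatorname{Corr}(h_n, h_{n+k}) = \frac{R(k)}{R(0)}. \tag{11}$$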


Then we obtain by (7) that, in the model MA(1),

$$\rho(k) = \begin{cases} 1, & k = 0, \\[2pt] \dfrac{b_0 b_1}{b_0^2 + b_1^2}, & k = 1, \\[6pt] 0, & k > 1. \end{cases} \tag{12}$$


It is interesting that if we set $\theta_1 = b_1/b_0$, then

$$\rho(1) = \frac{\theta_1}{1 + \theta_1^2} = \frac{1/\theta_1}{1 + (1/\theta_1)^2}. \tag{13}$$
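6. We now consider the model MA(q):

$$h_n = \mu + \beta(L)\varepsilon_n, \quad\text{where}\quad \beta(L) = b_0 + b_1 L + \cdots + b_q L^q.$$
It is easy to see that

$$\mathsf{E}h_n = \mu, \qquad \mathsf{D}h_n = b_0^2 + b_1^2 + \cdots + b_q^2$$

and
$$R(k) = \begin{cases} \displaystyle\sum_{i=0}^{q-k} b_i b_{k+i}, & k = 0, 1, \ldots, q, \\[6pt] 0, & k > q. \end{cases} \tag{14}$$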


It is clear from (14) that schemes of type MA(q) can be used to simulate (by varying
the coefficients $b_i$) the behavior of sequences $h = (h_n)$ with no correlation between
the variables $h_n$ and $h_{n+k}$ for $k > q$.
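Formula (14) is easy to tabulate. The following is a minimal sketch, assuming unit-variance white noise as above; the function name `ma_autocov` is hypothetical and chosen only for this illustration.

```python
def ma_autocov(b, k):
    """Autocovariance R(k) of MA(q) with coefficients b = [b_0, ..., b_q].

    Implements R(k) = sum_{i=0}^{q-k} b_i * b_{k+i} for 0 <= k <= q and
    R(k) = 0 for k > q, assuming white noise with unit variance.
    """
    q = len(b) - 1
    k = abs(k)  # R(-k) = R(k) for a stationary sequence
    if k > q:
        return 0.0
    return sum(b[i] * b[k + i] for i in range(q - k + 1))


# Example: an MA(2) scheme with illustrative coefficients.
b = [1.0, 0.5, -0.25]
autocov = [ma_autocov(b, k) for k in range(5)]          # R(0), ..., R(4)
autocorr = [r / autocov[0] for r in autocov]            # rho(0), ..., rho(4)
```

For these coefficients $R(0) = 1.3125$, $R(1) = 0.375$, $R(2) = -0.25$, and $R(k) = 0$ for $k \ge 3$, so the correlation is cut off after lag $q = 2$.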


Remark. In our discussion of the adjustment of one or another model to empirical
data, or, as we put it above, ‘the simulation of the behavior of the sequences
$h = (h_n)$’, we should mention that the general pattern here is as follows.

Given sample values $h_1, h_2, \ldots$, we determine first certain empirical
characteristics, e.g., the sample mean

$$\bar h_n = \frac{1}{n}\sum_{k=1}^{n} h_k,$$

the sample variance

$$\hat\sigma_n^2 = \frac{1}{n}\sum_{k=1}^{n} (h_k - \bar h_n)^2,$$

the sample correlation (of order $k$)

$$r_n(k) = \frac{1}{n\hat\sigma_n^2}\sum_{i=k+1}^{n} (h_i - \bar h_n)(h_{i-k} - \bar h_n),$$

the sample partial correlations, and so on.

After that, using the formulas for the corresponding theoretical characteristics
(as, e.g., (12) or (14)) in the models that we are approximating, we vary the
parameters (such as, for instance, the $b_i$ in (12) and (14)) to adjust the theoretical
characteristics to the empirical. Finally, at the last stage, we estimate the quality
of this adjustment using our knowledge about the distributions of the empirical
characteristics and their deviations from the theoretical distributions.
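The first two stages of this pattern can be sketched for the MA(1) model as follows. This is an illustration under stated assumptions, not a prescribed procedure: the data are simulated with hypothetical parameters ($\mu = 2$, $b_0 = 1$, $b_1 = 0.5$, Gaussian noise), all function names are invented for the example, and the adjustment simply solves $\theta_1/(1 + \theta_1^2) = r_n(1)$ for $\theta_1 = b_1/b_0$ (cf. (13)). By (13), $\theta_1$ and $1/\theta_1$ give the same $\rho(1)$; the sketch picks the root with $|\theta_1| \le 1$, which is a convention rather than something the data can decide.

```python
import math
import random


def sample_mean(h):
    return sum(h) / len(h)


def sample_var(h):
    m = sample_mean(h)
    return sum((x - m) ** 2 for x in h) / len(h)


def sample_corr(h, k):
    """Lag-k sample correlation, normalized by n times the sample variance."""
    m, n = sample_mean(h), len(h)
    s = sum((h[i] - m) * (h[i - k] - m) for i in range(k, n))
    return s / (n * sample_var(h))


def fit_ma1_theta(r1):
    """Solve theta / (1 + theta^2) = r1 for the root with |theta| <= 1."""
    if r1 == 0.0:
        return 0.0
    if abs(r1) > 0.5:
        raise ValueError("no MA(1) model has |rho(1)| > 1/2")
    return (1.0 - math.sqrt(1.0 - 4.0 * r1 * r1)) / (2.0 * r1)


# Hypothetical data: an MA(1) path with mu = 2, b0 = 1, b1 = 0.5.
rng = random.Random(12345)
eps = [rng.gauss(0.0, 1.0) for _ in range(20001)]
h = [2.0 + eps[n] + 0.5 * eps[n - 1] for n in range(1, 20001)]

r1 = sample_corr(h, 1)          # should be near rho(1) = 0.4
r3 = sample_corr(h, 3)          # should be near rho(3) = 0
theta_hat = fit_ma1_theta(r1)   # estimate of b1 / b0
```

The third stage (judging the quality of the adjustment) needs the sampling distributions of $\bar h_n$ and $r_n(k)$ and is not attempted here.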
7. The next natural step after the models MA(q) is to consider MA($\infty$), i.e., models
with
$$h_n = \mu + \sum_{j=0}^{\infty} b_j\varepsilon_{n-j}. \tag{15}$$
Of course, we must impose certain conditions on the coefficients to ensure that the
sum in (15) is convergent. If we require that

$$\sum_{j=0}^{\infty} b_j^2 < \infty, \tag{16}$$

then the series in (15) converges in mean square.
Under this assumption,

$$\mathsf{E}h_n = \mu, \qquad \mathsf{D}h_n = \sum_{j=0}^{\infty} b_j^2, \tag{17}$$

and

$$R(k) = \sum_{j=0}^{\infty} b_{k+j} b_j, \qquad k \ge 0. \tag{18}$$

Considering the representation (15) for $h = (h_n)$, it is usually said in the theory
of stationary stochastic processes that $h$ is the ‘output of a physically realizable
filter with impulse response $b = (b_j)$ and with input $\varepsilon = (\varepsilon_n)$’.

It is remarkable that, in a way, each ‘regular’, stationary (in the wide sense)
sequence $h = (h_n)$ can be represented as a sum (15) with property (16). As regards
the precise formulation and the whole range of problems relating to the Wold
expansion of a stationary sequence in a sum of ‘singular’ and ‘regular’ components
such that the latter are representable as in (15), see § 2d below and, in greater
detail, e.g., [439; Chapter VI, § 5].


§2b. Autoregressive Model AR(p) 


1. By the definition in § 1d, a sequence $h = (h_n)_{n \ge 1}$ is said to be governed by the
autoregressive model (scheme) AR(p) of order $p$ if

$$h_n = \mu_n + \sigma\varepsilon_n, \tag{1}$$

where
$$\mu_n = a_0 + a_1 h_{n-1} + \cdots + a_p h_{n-p}. \tag{2}$$

We can put it another way by saying that $h = (h_n)$ satisfies the difference
equation of order $p$

$$h_n = a_0 + a_1 h_{n-1} + \cdots + a_p h_{n-p} + \sigma\varepsilon_n, \tag{3}$$
which, using the operator $L$ introduced in § 2a, can be rewritten as

$$(1 - a_1 L - \cdots - a_p L^p)h_n = a_0 + \sigma\varepsilon_n, \tag{4}$$
or, in a more compact form,
$$\alpha(L)h_n = w_n, \tag{5}$$
where $\alpha(L) = 1 - a_1 L - \cdots - a_p L^p$ and $w_n = a_0 + \sigma\varepsilon_n$.


As already mentioned in § 1d, for a complete description of the evolution of the
sequence $h = (h_n)$ governed by the difference equation (3) we must also set ‘initial
values’ $(h_{1-p}, h_{2-p}, \ldots, h_0)$.
One often sets $h_{1-p} = \cdots = h_0 = 0$, or one can assume that these are random
variables independent of the sequence $\varepsilon_1, \varepsilon_2, \ldots$. In ‘ergodic’ cases the asymptotic
behavior of the $h_n$ as $n \to \infty$ is independent of the ‘initial’ conditions, and in this
sense, the initial data are not that important. Still, we describe precisely all our
assumptions concerning the ‘initial’ conditions in what follows.

We present the results of a computer simulation of the autoregressive
model AR(2) in Fig. 19.

FIGURE 19. Computer simulation of a sequence $h = (h_n)$ ($2 \le n \le 1000$)
governed by the AR(2) model with $h_n = a_0 + a_1 h_{n-1} + a_2 h_{n-2} + \sigma\varepsilon_n$,
where $a_0 = 0$, $a_1 = -0.5$, $a_2 = 0.01$, $\sigma = 0.1$, and $h_0 = h_1 = 0$


2. First, we consider in detail the simple case of $p = 1$, where
$$h_n = a_0 + a_1 h_{n-1} + \sigma\varepsilon_n. \tag{6}$$

It can be distinguished from the general class of models AR(p) by the property
that, of all the ‘past-related’ variables $h_{n-1}, h_{n-2}, \ldots, h_{n-p}$ involved in (3), the
contribution to $h_n$ is made only by the variable closest in time, $h_{n-1}$.
If $\varepsilon = (\varepsilon_n)_{n \ge 1}$ is a sequence of independent random variables and $h_0$ is
independent of $\varepsilon$, then the sequence $h = (h_n)_{n \ge 1}$ becomes a classical example of a
constructively defined Markov chain.

From (6) we find recursively that

$$h_n = a_0(1 + a_1 + \cdots + a_1^{n-1}) + a_1^n h_0 + \sigma(\varepsilon_n + a_1\varepsilon_{n-1} + \cdots + a_1^{n-1}\varepsilon_1). \tag{7}$$


Hence the properties of the sequence $h$ essentially depend on the values of the
parameter $a_1$. We must distinguish between the three cases $|a_1| < 1$, $|a_1| = 1$,
and $|a_1| > 1$, where the case $|a_1| = 1$ plays the role of a ‘boundary’ of sorts,
which we explain in what follows.
By (7) we obtain
$$\mathsf{E}h_n = a_1^n\,\mathsf{E}h_0 + a_0(1 + a_1 + \cdots + a_1^{n-1}),$$
$$\mathsf{D}h_n = a_1^{2n}\,\mathsf{D}h_0 + \sigma^2(1 + a_1^2 + \cdots + a_1^{2(n-1)}),$$
$$\operatorname{Cov}(h_n, h_{n-k}) = a_1^{2n-k}\,\mathsf{D}h_0 + \sigma^2 a_1^k(1 + a_1^2 + \cdots + a_1^{2(n-k-1)})$$

for $n - k \ge 1$.
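It is clear from these relations that if $|a_1| < 1$ and $\mathsf{E}|h_0| < \infty$, then

$$\mathsf{E}h_n = a_1^n\,\mathsf{E}h_0 + \frac{a_0(1 - a_1^n)}{1 - a_1} \to \frac{a_0}{1 - a_1}$$

as $n \to \infty$ and (if $\mathsf{D}h_0 < \infty$)

$$\mathsf{D}h_n = a_1^{2n}\,\mathsf{D}h_0 + \frac{\sigma^2(1 - a_1^{2n})}{1 - a_1^2} \to \frac{\sigma^2}{1 - a_1^2},$$
$$\operatorname{Cov}(h_n, h_{n-k}) \to \frac{\sigma^2 a_1^k}{1 - a_1^2}.$$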
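These limits can be observed by iterating the one-step relations $\mathsf{E}h_k = a_0 + a_1\mathsf{E}h_{k-1}$ and $\mathsf{D}h_k = a_1^2\mathsf{D}h_{k-1} + \sigma^2$, which follow from (6) when $\varepsilon_k$ is uncorrelated with $h_{k-1}$. The sketch below is an illustration only; the function name `ar1_moments` and the parameter values are assumptions made for the example.

```python
def ar1_moments(a0, a1, sigma, n, mean0=0.0, var0=0.0):
    """Return (E h_n, D h_n) for h_k = a0 + a1*h_{k-1} + sigma*eps_k.

    Iterates E h_k = a0 + a1*E h_{k-1} and D h_k = a1^2*D h_{k-1} + sigma^2,
    starting from E h_0 = mean0 and D h_0 = var0.
    """
    mean, var = mean0, var0
    for _ in range(n):
        mean = a0 + a1 * mean
        var = a1 * a1 * var + sigma * sigma
    return mean, var


a0, a1, sigma = -1.0, 0.5, 0.1

# |a1| < 1: the moments settle near a0/(1 - a1) and sigma^2/(1 - a1^2),
# whatever the (finite) initial moments are.
mean_far, var_far = ar1_moments(a0, a1, sigma, n=200, mean0=7.0, var0=3.0)
limit_mean = a0 / (1.0 - a1)
limit_var = sigma ** 2 / (1.0 - a1 ** 2)

# a1 = 1: no steady state; the mean drifts and the variance grows linearly.
mean_rw, var_rw = ar1_moments(a0, 1.0, sigma, n=200)
```

With $a_1 = 1$ and a constant $h_0$ the same recursion returns $\mathsf{E}h_n = a_0 n + h_0$ and $\mathsf{D}h_n = \sigma^2 n$, the random walk case discussed later in this subsection.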


Hence, the sequence $h = (h_n)_{n \ge 0}$ approaches in this case ($|a_1| < 1$) a steady
state as $n \to \infty$. Moreover, if the initial distribution (the distribution of $h_0$) is
normal, i.e.,
$$h_0 \sim \mathcal{N}\Bigl(\frac{a_0}{1 - a_1}, \frac{\sigma^2}{1 - a_1^2}\Bigr),$$

then $h = (h_n)_{n \ge 0}$ is a stationary Gaussian sequence (both in the wide and the
strict sense) with

$$\mathsf{E}h_n = \frac{a_0}{1 - a_1}, \qquad \mathsf{D}h_n = \frac{\sigma^2}{1 - a_1^2} \tag{8}$$
and
$$\operatorname{Cov}(h_n, h_{n+k}) = \frac{\sigma^2 a_1^k}{1 - a_1^2}. \tag{9}$$
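We recall that a sequence is stationary in the strict sense if
$$\operatorname{Law}(h_0, h_1, \ldots, h_m) = \operatorname{Law}(h_k, h_{1+k}, \ldots, h_{m+k})$$
for all admissible values of $m$ and $k$, while it is stationary in the wide sense if
it has finite second moments and
$$\mathsf{E}h_i = \mathsf{E}h_{i+k}, \qquad \operatorname{Cov}(h_i, h_j) = \operatorname{Cov}(h_{i+k}, h_{j+k}).$$
If
$$\rho(k) = \operatorname{Corr}(h_n, h_{n+k}) = \frac{\operatorname{Cov}(h_n, h_{n+k})}{\sqrt{\mathsf{D}h_n\,\mathsf{D}h_{n+k}}},$$
then we obtain by (8) and (9) that
$$\rho(k) = a_1^k \tag{10}$$

for $|a_1| < 1$, i.e., the correlation between the variables $h_n$ and $h_{n+k}$ decreases
geometrically.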

Comparing the representation (7) and formula (2) in the preceding section, we
observe that for each fixed $n$ the variable $h_n$ in the AR(1) model can be treated
as its counterpart $h_n$ in the MA(q) model with $q = n - 1$. Slightly abusing the
language, one sometimes says that ‘the AR(1) model can be regarded as the MA($\infty$)
model’.

In AR(1), the case $|a_1| = 1$ corresponds to a classical random walk (cf. Chapter I,
§ 2a, where we discuss the random walk conjecture and the concept of an
‘efficient’ market). For example, if $a_1 = 1$, then

$$h_n = a_0 n + h_0 + \sigma(\varepsilon_1 + \cdots + \varepsilon_n).$$

Hence
$$\mathsf{E}h_n = a_0 n + \mathsf{E}h_0$$

and

$$\mathsf{D}h_n = \sigma^2 n \to \infty \quad\text{as } n \to \infty.$$

The case of $|a_1| > 1$ is ‘explosive’ (in the sense that both the expectation $\mathsf{E}h_n$
and variance $\mathsf{D}h_n$ increase exponentially with $n$).
3. We now consider the case $p = 2$:
$$h_n = a_0 + a_1 h_{n-1} + a_2 h_{n-2} + \sigma\varepsilon_n. \tag{11}$$

Using the above-mentioned operator $L$ we can rewrite this difference equation as
follows:
$$(1 - a_1 L - a_2 L^2)h_n = a_0 + \sigma\varepsilon_n. \tag{12}$$

If $a_2 = 0$, then the corresponding equation
$$(1 - a_1 L)h_n = a_0 + \sigma\varepsilon_n \tag{13}$$

fits into the case AR(1) we have already discussed.
We now set $w_n = a_0 + \sigma\varepsilon_n$. Then (13) assumes the following form:

$$(1 - a_1 L)h_n = w_n. \tag{14}$$
It is natural to look for a ‘conversion’ of this relation enabling one to determine the
$h_n$ from the ‘input’ $(w_n)$.
Based on the properties of $L$ (see § 2a.2) we can see that

$$(1 + a_1 L + a_1^2 L^2 + \cdots + a_1^k L^k)(1 - a_1 L) = 1 - a_1^{k+1} L^{k+1}. \tag{15}$$

We now apply the operator $1 + a_1 L + a_1^2 L^2 + \cdots + a_1^k L^k$ to both sides of (13). In
view of (15),

$$h_n = (1 + a_1 L + a_1^2 L^2 + \cdots + a_1^k L^k)w_n + a_1^{k+1} L^{k+1} h_n. \tag{16}$$
Setting here $k = n - 1$ and $w_n = a_0 + \sigma\varepsilon_n$ we obtain
$$h_n = (a_0 + \sigma\varepsilon_n) + a_1(a_0 + \sigma\varepsilon_{n-1}) + \cdots + a_1^{n-1}(a_0 + \sigma\varepsilon_1) + a_1^n h_0, \tag{17}$$

which is precisely the representation (7) above.
By (16), bearing in mind (14), with $k = n - 1$ we obtain

$$h_n = (1 + a_1 L + a_1^2 L^2 + \cdots + a_1^{n-1} L^{n-1})(1 - a_1 L)h_n + a_1^n h_0. \tag{18}$$

If $|a_1| < 1$ and $n$ is sufficiently large, then we have the approximate equality
$$h_n \approx (1 + a_1 L + a_1^2 L^2 + \cdots + a_1^{n-1} L^{n-1})(1 - a_1 L)h_n. \tag{19}$$
This suggests that a natural way to define the inverse operator $(1 - a_1 L)^{-1}$
is to consider the limit (in an appropriate sense) of the sequence of operators
$1 + a_1 L + a_1^2 L^2 + \cdots + a_1^n L^n$ as $n \to \infty$. (Cf. the algebraic representation
$(1 - z)^{-1} = 1 + z + z^2 + \cdots$ for $|z| < 1$.)
These heuristic arguments can be formalized, e.g., as follows.
We consider a stationary sequence $\tilde h = (\tilde h_n)$, where

$$\tilde h_n = \sum_{j=0}^{\infty} a_1^j w_{n-j} \tag{20}$$

(the series converges in mean square). It is easy to see that $(\tilde h_n)$ is a solution of (14).
We claim that this is the unique stationary solution with finite second moment.
Let $h = (h_n)$ be another stationary solution. Then

$$h_n = \sum_{j=0}^{k} a_1^j w_{n-j} + a_1^{k+1} h_{n-(k+1)} \tag{21}$$

by (16), therefore

$$\mathsf{E}\Bigl|h_n - \sum_{j=0}^{k} a_1^j w_{n-j}\Bigr|^2 = a_1^{2(k+1)}\,\mathsf{E}h_{n-(k+1)}^2 = a_1^{2(k+1)}\,\mathsf{E}h_0^2 \to 0 \tag{22}$$

as $k \to \infty$.
Hence we obtain by (20) that equation (13) has a unique stationary solution
(with finite second moment).


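4. The above arguments show the way to the ‘conversion’ of difference equation (12)
enabling one to determine the elements of the sequence $(h_n)$ by the elements of $(w_n)$.
Since

$$(1 - \lambda_1 L)(1 - \lambda_2 L) = 1 - (\lambda_1 + \lambda_2)L + \lambda_1\lambda_2 L^2 \tag{23}$$

for any $\lambda_1$ and $\lambda_2$, it follows that if we choose $\lambda_1$ and $\lambda_2$ such that

$$\lambda_1 + \lambda_2 = a_1, \qquad \lambda_1\lambda_2 = -a_2, \tag{24}$$
then
$$1 - a_1 L - a_2 L^2 = (1 - \lambda_1 L)(1 - \lambda_2 L). \tag{25}$$
It is clear from (24) that $\lambda_1$ and $\lambda_2$ are the roots of the quadratic equation
$$\lambda^2 - a_1\lambda - a_2 = 0, \tag{26}$$

i.e.,

$$\lambda_1 = \frac{a_1 + \sqrt{a_1^2 + 4a_2}}{2} \quad\text{and}\quad \lambda_2 = \frac{a_1 - \sqrt{a_1^2 + 4a_2}}{2}.$$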
To put it another way, $\lambda_1 = z_1^{-1}$ and $\lambda_2 = z_2^{-1}$, where $z_1$ and $z_2$ are the roots
of the equation

$$1 - a_1 z - a_2 z^2 = 0, \tag{27}$$
while the polynomial $1 - a_1 z - a_2 z^2$ is the result of the substitution $L \to z$ in the
operator expression $1 - a_1 L - a_2 L^2$.
In view of (25), we can rewrite (12) as follows:

$$(1 - \lambda_1 L)(1 - \lambda_2 L)h_n = w_n.$$

After the formal multiplication of both sides by $(1 - \lambda_2 L)^{-1}(1 - \lambda_1 L)^{-1}$ we see
that

$$h_n = (1 - \lambda_1 L)^{-1}(1 - \lambda_2 L)^{-1} w_n. \tag{28}$$


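If $\lambda_1 \ne \lambda_2$, then we formally obtain

$$(1 - \lambda_1 L)^{-1}(1 - \lambda_2 L)^{-1} = (\lambda_1 - \lambda_2)^{-1}\Bigl[\frac{\lambda_1}{1 - \lambda_1 L} - \frac{\lambda_2}{1 - \lambda_2 L}\Bigr], \tag{29}$$

therefore

$$h_n = \frac{\lambda_1}{\lambda_1 - \lambda_2}(1 - \lambda_1 L)^{-1} w_n - \frac{\lambda_2}{\lambda_1 - \lambda_2}(1 - \lambda_2 L)^{-1} w_n. \tag{30}$$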


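If $|\lambda_i| < 1$, $i = 1, 2$, i.e., if the roots of the characteristic equation (27) lie outside
the unit disc, then

$$(1 - \lambda_i L)^{-1} = 1 + \lambda_i L + \lambda_i^2 L^2 + \cdots, \qquad i = 1, 2, \tag{31}$$

and therefore, in view of (30), the stationary solution (with finite second moment)
of (12) has the following form:

$$h_n = \sum_{j=0}^{\infty} (c_1\lambda_1^j + c_2\lambda_2^j)\,w_{n-j}, \tag{32}$$
where
$$c_1 = \frac{\lambda_1}{\lambda_1 - \lambda_2}, \qquad c_2 = -\frac{\lambda_2}{\lambda_1 - \lambda_2}.$$

(The uniqueness of the stationary solution to (12) can be proved in the same way
as for (13).)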
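The passage from the coefficients $(a_1, a_2)$ to the weights $c_1\lambda_1^j + c_2\lambda_2^j$ in (32) can be sketched as follows. This is an illustration under stated assumptions: the function name `ar2_weights` is hypothetical, complex arithmetic is used because the roots of (26) may be complex conjugates, and the case $\lambda_1 = \lambda_2$ (not covered by (29)) is rejected rather than handled.

```python
import cmath


def ar2_weights(a1, a2, n_weights):
    """Weights psi_j = c1*lam1**j + c2*lam2**j of w_{n-j} in (32), j < n_weights."""
    disc = cmath.sqrt(a1 * a1 + 4.0 * a2)
    lam1 = (a1 + disc) / 2.0          # roots of lam^2 - a1*lam - a2 = 0
    lam2 = (a1 - disc) / 2.0
    if lam1 == lam2:
        raise ValueError("repeated root: the expansion (29) does not apply")
    if max(abs(lam1), abs(lam2)) >= 1.0:
        raise ValueError("need |lam_i| < 1 for the stationary solution")
    c1 = lam1 / (lam1 - lam2)
    c2 = -lam2 / (lam1 - lam2)
    # For real a1, a2 the imaginary parts cancel; keep the real part only.
    return [(c1 * lam1 ** j + c2 * lam2 ** j).real for j in range(n_weights)]


# Parameters of the AR(2) simulation in Fig. 19.
weights = ar2_weights(-0.5, 0.01, 8)
```

A useful cross-check is that the same weights satisfy $\psi_0 = 1$, $\psi_1 = a_1$, and $\psi_j = a_1\psi_{j-1} + a_2\psi_{j-2}$, which follows from applying $1 - a_1 L - a_2 L^2$ to (32).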


132 Chapter II. Stochastic Models. Discrete Time 


5. Finally, we proceed to general AR(p) models, in which
$$h_n = a_0 + a_1 h_{n-1} + \cdots + a_p h_{n-p} + \sigma\varepsilon_n, \tag{33}$$
that is (setting $w_n = a_0 + \sigma\varepsilon_n$),

$$(1 - a_1 L - a_2 L^2 - \cdots - a_p L^p)h_n = w_n. \tag{34}$$

Using the same method as for $p = 1$ or $p = 2$, we consider the factorization

$$1 - a_1 L - a_2 L^2 - \cdots - a_p L^p = (1 - \lambda_1 L)(1 - \lambda_2 L)\cdots(1 - \lambda_p L) \tag{35}$$

and assume that the $\lambda_i$, $i = 1, \ldots, p$, are all distinct.

If $|\lambda_i| < 1$, $i = 1, \ldots, p$, then we can express the stationary solution of (34),
which is the unique solution with finite second moment starting at $-\infty$ in time, as
follows:

$$h_n = (1 - \lambda_1 L)^{-1}\cdots(1 - \lambda_p L)^{-1} w_n. \tag{36}$$

We note that the numbers $\lambda_1, \lambda_2, \ldots, \lambda_p$ are the roots of the equation

$$\lambda^p - a_1\lambda^{p-1} - \cdots - a_{p-1}\lambda - a_p = 0 \tag{37}$$

(cf. (26)). Putting that another way, we can say that $\lambda_i = z_i^{-1}$, where the $z_i$ are
the roots of the equation $1 - a_1 z - a_2 z^2 - \cdots - a_p z^p = 0$ (cf. (27)). Hence we can
obtain a stationary solution $h = (h_n)$ if all the roots of this equation lie outside the
unit disc.

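To obtain a representation of type (30) or (32) we consider the following analogue
of the expansion (29):

$$\frac{1}{(1 - \lambda_1 z)\cdots(1 - \lambda_p z)} = \frac{c_1}{1 - \lambda_1 z} + \cdots + \frac{c_p}{1 - \lambda_p z}, \tag{38}$$

where $c_1, \ldots, c_p$ are constants to be determined.
Multiplying both sides of (38) by $(1 - \lambda_1 z)\cdots(1 - \lambda_p z)$, we see that for all $z$ we
must have
$$1 = \sum_{i=1}^{p} c_i \prod_{\substack{1 \le k \le p \\ k \ne i}} (1 - \lambda_k z). \tag{39}$$

This equality must hold, in particular, for $z = \lambda_1^{-1}, \ldots, z = \lambda_p^{-1}$, which gives us
the following values of $c_1, \ldots, c_p$:
$$c_i = \frac{\lambda_i^{p-1}}{\prod\limits_{\substack{1 \le k \le p \\ k \ne i}} (\lambda_i - \lambda_k)}. \tag{40}$$

(We note that $c_1 + \cdots + c_p = 1$.)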


2. Linear Stochastic Models 133 


By (36), (38), and (40),
$$h_n = \sum_{j=0}^{\infty} (c_1\lambda_1^j + \cdots + c_p\lambda_p^j)\,w_{n-j}, \tag{41}$$

which is a generalization of (32) to the case $p > 2$.
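The constants in (40) and the weights in (41) can be computed directly from given distinct numbers $\lambda_1, \ldots, \lambda_p$. The sketch below is illustrative only: all function names are hypothetical, the $\lambda_i$ are taken as real inputs (finding them from $a_1, \ldots, a_p$ requires a polynomial root finder, which is not shown), and the coefficients $a_i$ are recovered by expanding the product in (35).

```python
def partial_fraction_coeffs(lams):
    """c_i = lam_i**(p-1) / prod_{k != i} (lam_i - lam_k), for distinct lam_i."""
    p = len(lams)
    coeffs = []
    for i, li in enumerate(lams):
        denom = 1.0
        for k, lk in enumerate(lams):
            if k != i:
                denom *= (li - lk)
        coeffs.append(li ** (p - 1) / denom)
    return coeffs


def ar_coeffs_from_roots(lams):
    """Expand prod_i (1 - lam_i z) = 1 - a_1 z - ... - a_p z^p; return [a_1..a_p]."""
    poly = [1.0]                      # coefficients of z^0, z^1, ...
    for lam in lams:
        new = poly + [0.0]
        for j in range(len(poly)):
            new[j + 1] -= lam * poly[j]
        poly = new
    return [-c for c in poly[1:]]


def ma_weights(lams, n_weights):
    """Weights c_1*lam_1**j + ... + c_p*lam_p**j of w_{n-j} in (41)."""
    c = partial_fraction_coeffs(lams)
    return [sum(ci * li ** j for ci, li in zip(c, lams)) for j in range(n_weights)]


lams = [0.5, -0.3, 0.2]               # illustrative values with |lam_i| < 1
c = partial_fraction_coeffs(lams)     # sums to 1, as noted after (40)
a = ar_coeffs_from_roots(lams)        # [0.4, 0.11, -0.03] up to rounding
psi = ma_weights(lams, 10)
```

As in the case $p = 2$, the weights obey the recursion $\psi_j = a_1\psi_{j-1} + \cdots + a_p\psi_{j-p}$ with $\psi_0 = 1$ and $\psi_j = 0$ for $j < 0$, which gives an independent check of (40).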
The representation (41) enables one to evaluate various characteristics of the
sequence $h = (h_n)$ such as the moments $\mathsf{E}h_n^k$, the covariances, the expectations
$\mathsf{E}(h_{n+k} \mid \mathscr{F}_n)$ (where $\mathscr{F}_n = \sigma(\ldots, w_{-1}, w_0, \ldots, w_n)$), and so on.
Assuming that we have a stationary case it is easy to find the moments $\mathsf{E}h_n \equiv \mu$
directly from (33):
$$\mu = \frac{a_0}{1 - (a_1 + \cdots + a_p)}. \tag{42}$$
As regards the covariances $R(k) = \operatorname{Cov}(h_n, h_{n+k})$, we see easily from (33) that
$$R(k) = a_1 R(k - 1) + \cdots + a_p R(k - p) \tag{43}$$

for $k = 1, 2, \ldots$. If $k = 0$, then
$$R(0) = a_1 R(1) + \cdots + a_p R(p) + \sigma^2. \tag{44}$$

The correlation functions $\rho(k)$, $k > 0$, satisfy the same equations (43) and (44) (they
are called the Yule-Walker equations, after G. U. Yule and G. T. Walker; [271]).
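For $p = 2$ these equations can be solved by hand, which the following sketch turns into code. It is an illustration under stated assumptions: the function names are hypothetical, the symmetry $\rho(-k) = \rho(k)$ of a stationary sequence is used to write (43) with $k = 1$ as $\rho(1) = a_1 + a_2\rho(1)$, and (44) is divided by $R(0)$ to express the variance.

```python
def ar2_correlations(a1, a2, kmax):
    """rho(0), ..., rho(kmax) of a stationary AR(2) from the Yule-Walker recursion."""
    rho = [1.0, a1 / (1.0 - a2)]      # (43) with k = 1 and rho(-1) = rho(1)
    for k in range(2, kmax + 1):
        rho.append(a1 * rho[k - 1] + a2 * rho[k - 2])
    return rho[:kmax + 1]


def ar2_from_correlations(rho1, rho2):
    """Invert (43) for k = 1, 2: recover (a1, a2) from rho(1), rho(2)."""
    det = 1.0 - rho1 * rho1
    return rho1 * (1.0 - rho2) / det, (rho2 - rho1 * rho1) / det


def ar2_variance(a1, a2, sigma):
    """R(0) from (44): R(0) * (1 - a1*rho(1) - a2*rho(2)) = sigma^2."""
    rho = ar2_correlations(a1, a2, 2)
    return sigma ** 2 / (1.0 - a1 * rho[1] - a2 * rho[2])


rho = ar2_correlations(-0.5, 0.01, 5)          # parameters of Fig. 19
a1_back, a2_back = ar2_from_correlations(rho[1], rho[2])
```

The second function is the basis of moment-type estimation: replacing $\rho(1)$, $\rho(2)$ by sample correlations yields estimates of $a_1$, $a_2$.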


6. One of the central issues in the statistics of the autoregressive schemes AR(p)
is the estimation of the parameters $\theta = (a_0, a_1, \ldots, a_p, \sigma)$ involved in (1) and (2),
where we now assume for definiteness that $h_0, h_1, \ldots$ are known constants.

If we assume that the white noise $\varepsilon = (\varepsilon_n)$ is Gaussian, then the most important
estimation tool is the maximum likelihood method, in which the quantity

$$\hat\theta_n = \arg\max_{\theta}\, p_\theta(h_1, h_2, \ldots, h_n)$$

is the estimator for the parameter $\theta$ defined in terms of the observations
$h_1, h_2, \ldots, h_n$ (here $p_\theta(h_1, h_2, \ldots, h_n)$ is the joint probability density of the
(Gaussian) vector $(h_1, h_2, \ldots, h_n)$).

We shall illustrate the general principle pertaining to the maximum likelihood
method by the example of the AR(1) model with $h_n = a_0 + a_1 h_{n-1} + \sigma\varepsilon_n$; here we
regard $\sigma > 0$ as a known parameter, $h_0 = 0$, and $n \ge 1$.

Since $\theta = (a_0, a_1)$, it follows that

$$p_\theta(h_1, \ldots, h_n) = \Bigl(\frac{1}{\sqrt{2\pi\sigma^2}}\Bigr)^{n} \exp\Bigl\{-\sum_{k=1}^{n} \frac{(h_k - a_0 - a_1 h_{k-1})^2}{2\sigma^2}\Bigr\}.$$
It is easy to see that the value of the estimator $\hat\theta = (\hat a_0, \hat a_1)$ can be found from
the minimality condition for the function

$$\psi(a_0, a_1) = \sum_{k=1}^{n} (h_k - a_0 - a_1 h_{k-1})^2.$$

Letting $a_0 \in \mathbb{R}$ and $a_1 \in \mathbb{R}$, we obtain

$$\frac{\partial\psi}{\partial a_0} = 0 \iff \sum_{k=1}^{n} (h_k - a_0 - a_1 h_{k-1}) = 0,$$
$$\frac{\partial\psi}{\partial a_1} = 0 \iff \sum_{k=1}^{n} (h_k - a_0 - a_1 h_{k-1})\,h_{k-1} = 0. \tag{45}$$

Solving this linear system one finds the values of the estimators $\hat a_0$ and $\hat a_1$.
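The linear system (45) has a closed-form solution, sketched below. The function name `fit_ar1` and the test path are assumptions made for this illustration; the input list is taken to hold $h_0, h_1, \ldots, h_n$ in order.

```python
def fit_ar1(h):
    """Solve the normal equations (45) for (a0, a1), given h = [h_0, ..., h_n]."""
    x = h[:-1]                        # h_{k-1}, k = 1..n
    y = h[1:]                         # h_k,     k = 1..n
    n = len(y)
    sx, sy = sum(x), sum(y)
    sxx = sum(v * v for v in x)
    sxy = sum(u * v for u, v in zip(x, y))
    det = n * sxx - sx * sx
    if det == 0:
        raise ValueError("the h_{k-1} are all equal; a0 and a1 are not identifiable")
    a1 = (n * sxy - sx * sy) / det
    a0 = (sy - a1 * sx) / n
    return a0, a1


# Hypothetical noiseless path h_k = 0.3 + 0.6*h_{k-1}, h_0 = 0: the fit is exact.
path = [0.0]
for _ in range(10):
    path.append(0.3 + 0.6 * path[-1])
a0_hat, a1_hat = fit_ar1(path)
```

With Gaussian noise and known $\sigma$ this least-squares solution coincides with the maximum likelihood estimator, since maximizing the density above is the same as minimizing $\psi$.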
We now concentrate on the properties of the estimators $\hat a_1 = \hat a_1(h_1, \ldots, h_n)$,
$n \ge 1$, for $a_1$. We assume for simplicity that $a_0$ is a known parameter ($a_0 = 0$) and
$\sigma = 1$.
Under these assumptions, $h_n = a_1 h_{n-1} + \varepsilon_n$ and
$$\hat a_1 = \frac{\sum_{k=1}^{n} h_{k-1} h_k}{\sum_{k=1}^{n} h_{k-1}^2}$$
by (45). Hence

$$\hat a_1 = a_1 + \frac{\sum_{k=1}^{n} h_{k-1}\varepsilon_k}{\sum_{k=1}^{n} h_{k-1}^2}.$$

We now set
$$M_n = \sum_{k=1}^{n} h_{k-1}\varepsilon_k.$$
The sequence $M = (M_n)$ is (for each value of the parameter $a_1$) a martingale with
quadratic characteristic

$$\langle M\rangle_n = \sum_{k=1}^{n} h_{k-1}^2.$$
Hence if $a_1$ is the actual value of this unknown parameter, then
$$\hat a_1 = a_1 + \frac{M_n}{\langle M\rangle_n}. \tag{46}$$


From this representation we see that the maximum likelihood estimator $\hat a_1$ is
strongly consistent: $\hat a_1 \to a_1$ as $n \to \infty$ with probability one because $\langle M\rangle_n \to \infty$
(P-a.s.), and by the strong law of large numbers for square integrable martingales
(see (12) in § 1b and [439; Chapter VII, § 5]),

$$\frac{M_n}{\langle M\rangle_n} \to 0 \quad\text{(P-a.s.).}$$
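Calculating in our case ($a_0 = 0$, $\sigma = 1$) the Fisher information

$$I_n(a_1) = \mathsf{E}_{a_1}\Bigl\{-\frac{\partial^2 \ln p_{a_1}(h_1, \ldots, h_n)}{\partial a_1^2}\Bigr\},$$

where $\mathsf{E}_{a_1}$ is the expectation with respect to the measure
$\mathsf{P}_{a_1} = \operatorname{Law}(h_1, \ldots, h_n \mid \theta = a_1)$,
we obtain
$$I_n(a_1) = \mathsf{E}_{a_1}\langle M\rangle_n = \mathsf{E}_{a_1}\sum_{k=1}^{n} h_{k-1}^2.$$
Using the above formulas for the $\mathsf{E}h_{k-1}$ and $\mathsf{D}h_{k-1}$, we see that

$$I_n(a_1) \sim \begin{cases} \dfrac{n}{1 - a_1^2}, & |a_1| < 1, \\[8pt] \dfrac{n^2}{2}, & |a_1| = 1, \\[8pt] \dfrac{a_1^{2n}}{(a_1^2 - 1)^2}, & |a_1| > 1. \end{cases} \tag{47}$$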
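The three growth regimes in (47) can be compared with the exact value of $I_n(a_1)$. The sketch below is illustrative and assumes, as in the text, $a_0 = 0$, $\sigma = 1$, and $h_0 = 0$, so that $\mathsf{E}h_k^2 = a_1^2\,\mathsf{E}h_{k-1}^2 + 1$ with $\mathsf{E}h_0^2 = 0$; the function names are hypothetical.

```python
def fisher_information(a1, n):
    """Exact I_n(a1) = sum_{k=1}^n E h_{k-1}^2 for h_k = a1*h_{k-1} + eps_k, h_0 = 0."""
    total, second_moment = 0.0, 0.0   # second_moment holds E h_{k-1}^2
    for _ in range(n):
        total += second_moment
        second_moment = a1 * a1 * second_moment + 1.0
    return total


def fisher_asymptotic(a1, n):
    """Leading term of I_n(a1) as given in (47)."""
    if abs(a1) < 1.0:
        return n / (1.0 - a1 * a1)
    if abs(a1) == 1.0:
        return n * n / 2.0
    return a1 ** (2 * n) / (a1 * a1 - 1.0) ** 2


# Ratios exact / asymptotic; each tends to 1 as n grows.
ratio_stationary = fisher_information(0.5, 10000) / fisher_asymptotic(0.5, 10000)
ratio_unit_root = fisher_information(1.0, 10000) / fisher_asymptotic(1.0, 10000)
ratio_explosive = fisher_information(1.1, 200) / fisher_asymptotic(1.1, 200)
```

The information grows linearly in the stationary case, quadratically at $|a_1| = 1$, and exponentially in the explosive case, which is why different normalizations appear in the limit theorems for $\hat a_1 - a_1$.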
We see that $|a_1| = 1$ is a distinguished, ‘boundary’ case, when $h = (h_n)$
is a random walk (cf. Chapter I, § 2a “Random walk conjecture and concept of
efficient market”). For $|a_1| \ne 1$ the corresponding sequence $h = (h_n)$ is Markovian.
Moreover, if $|a_1| < 1$, then the sequence $h = (h_n)$ ‘approaches a steady state’ as
$n \to \infty$.

This fact is responsible for the considerable attention that is paid in the statistics
of the sequences $h = (h_n)$ to the solution of the following problem: which conjecture
of the two,

$$H_0\colon |a_1| = 1 \quad\text{or}\quad H_1\colon |a_1| > 1,$$

is more likely?

In the econometrics literature and, in particular, in the literature on financial
models the range of issues touching upon the validity of the equality $|a_1| = 1$ is
called ‘the unit root problem’.

We now present (without proofs; referring instead to [424], [445], and the literature
cited there for greater detail) several results concerning the limit distribution
for the deviations $\hat a_1 - a_1$.


THEOREM 1. As $n \to \infty$ we have

$$\lim_{n} \mathsf{P}_{a_1}\bigl\{\sqrt{I_n(a_1)}\,(\hat a_1 - a_1) \le x\bigr\} = \begin{cases} \Phi(x), & |a_1| < 1, \\ H_{a_1}(x), & |a_1| = 1, \\ \mathrm{Ch}(x), & |a_1| > 1, \end{cases} \tag{48}$$

where $\Phi(x) = \int_{-\infty}^{x} \varphi_{(0,1)}(y)\,dy$ is the standard normal distribution, $\mathrm{Ch}(x)$ is the
Cauchy distribution with density $\dfrac{1}{\pi(1 + x^2)}$, and $H_{a_1}(x)$ is the distribution of the
random variable

$$a_1\,\frac{W^2(1) - 1}{2\sqrt{2}\int_0^1 W^2(s)\,ds}, \tag{49}$$

where $(W(s))_{s \le 1}$ is a standard Wiener process (Brownian motion).

It is interesting that if $|a_1| \ne 1$, then the densities of the limit distributions
are symmetric with respect to the origin. However, the density of the distribution
$H_{a_1}(x)$ is asymmetric if $|a_1| = 1$. (This is an easy consequence of the observation
that $\mathsf{P}(W^2(1) - 1 > 0) \ne \tfrac{1}{2}$.)

The next result shows that, considering a random normalization of the deviation
$(\hat a_1 - a_1)$ (which means that we use the stochastic Fisher information $\langle M\rangle_n$ in place
of the Fisher information $I_n(a_1)$), we can obtain only two distinct limit distributions
rather than three as in (48).

THEOREM 2. As $n \to \infty$ we have

$$\lim_{n} \mathsf{P}_{a_1}\bigl\{\sqrt{\langle M\rangle_n}\,(\hat a_1 - a_1) \le x\bigr\} = \begin{cases} \Phi(x), & |a_1| \ne 1, \\ H'_{a_1}(x), & |a_1| = 1, \end{cases}$$

where $H'_{a_1}(x)$ is the probability distribution of the random variable

$$a_1\,\frac{W^2(1) - 1}{2\sqrt{\int_0^1 W^2(s)\,ds}}.$$

Finally, the sequential maximum likelihood estimators together with random
normalization bring us to a unique limit distribution.


THEOREM 3. Assume that $\theta > 0$, let
$$\tau(\theta) = \inf\{n \ge 1\colon \langle M\rangle_n \ge \theta\}, \tag{50}$$


and let 
$$\hat a_1(\tau(\theta)) = \frac{\sum_{k=1}^{\tau(\theta)} h_{k-1} h_k}{\sum_{k=1}^{\tau(\theta)} h_{k-1}^2} \tag{51}$$


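be the sequential maximum likelihood estimator. Then
$$\lim_{\theta\to\infty} \mathsf{P}_{a_1}\bigl\{\sqrt{\langle M\rangle_{\tau(\theta)}}\,[\hat a_1(\tau(\theta)) - a_1] \le x\bigr\} = \Phi(x) \tag{52}$$

for each $a_1 \in \mathbb{R}$.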
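The stopping rule (50) is simple to simulate. The sketch below is only an illustration under stated assumptions: Gaussian noise, $a_0 = 0$, $\sigma = 1$, $h_0 = 0$, and a sequential estimator understood as the ratio $\sum h_{k-1}h_k / \sum h_{k-1}^2$ evaluated at the stopping time; the function name and the chosen values of $a_1$ and $\theta$ are hypothetical.

```python
import random


def sequential_mle(a1, theta, rng):
    """Simulate h_k = a1*h_{k-1} + eps_k from h_0 = 0 until <M>_n >= theta.

    Returns (estimate of a1, stopping time tau, value of <M> at tau).
    """
    h_prev, num, den, n = 0.0, 0.0, 0.0, 0
    while den < theta:
        h = a1 * h_prev + rng.gauss(0.0, 1.0)
        num += h_prev * h             # running sum of h_{k-1} * h_k
        den += h_prev * h_prev        # running sum of h_{k-1}^2 = <M>_n
        h_prev = h
        n += 1
    return num / den, n, den


rng = random.Random(0)
est_stationary, tau_stationary, info_stationary = sequential_mle(0.5, 2000.0, rng)
est_unit_root, tau_unit_root, info_unit_root = sequential_mle(1.0, 2000.0, rng)
```

The observed information is (approximately) the same in both runs, while the number of observations needed to collect it differs: the unit root path typically reaches the level $\theta$ much sooner than the stationary one.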
We now compare this result with assertion (48).

If $n$ was our original time parameter, then we can regard $\theta$ as ‘new’, ‘operational’
time defined in terms of the stochastic Fisher information $\langle M\rangle$. We note that if Fisher
information changes only slightly on (large) time intervals, then they correspond to
small intervals of ‘new time’ $\theta$ and conversely. Thus, using this ‘new’, ‘operational’
time we make the flow of information more uniform and homogeneous: the incoming
data are now of ‘equal worth’ and are ‘identically distributed’ in a sense for all the
values of $a_1$. Eventually, this results in the uniqueness and normality of the limit
distribution.

We come across problems related to this ‘new’ time below, in Chapter IV, § 3d,
where we explain in close detail one such change of time aimed at ‘flattening’
the statistical data on currency cross rates, in the dynamics of which ‘geographic’
components of a periodic nature are clearly visible.


7. We now present several additional results on the properties of maximum likelihood
estimators (see [258], [424], and [445]).
First,
$$\sup_{a_1 \in \mathbb{R}} \mathsf{E}_{a_1}|\hat a_1 - a_1| \to 0 \quad\text{as } n \to \infty.$$
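Second, let $U(a_1)$ be the class of estimators $\tilde a_1$ with bias $b_{a_1}(\tilde a_1) = \mathsf{E}_{a_1}(\tilde a_1 - a_1)$
satisfying the conditions

$$b_{a_1}(\tilde a_1) \to 0 \quad\text{and}\quad \frac{d}{da_1}\,b_{a_1}(\tilde a_1) \to 0 \quad\text{as } n \to \infty.$$

(The maximum likelihood estimators $\hat a_1$ are in the class $U(a_1)$ if $|a_1| \ne 1$.)
If $|a_1| \ne 1$, then the maximum likelihood estimators $\hat a_1$ are asymptotically
efficient in $U(a_1)$ in the following sense: for $\tilde a_1 \in U(a_1)$ we have

$$\varlimsup_{n} \frac{\mathsf{E}_{a_1}\langle M\rangle_n(\hat a_1 - a_1)^2}{\mathsf{E}_{a_1}\langle M\rangle_n(\tilde a_1 - a_1)^2} \le 1.$$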

If $|a_1| < 1$ (the ‘stationary’ case), then the estimators $\hat a_1$ are also asymptotically
efficient in the usual sense, i.e., for all $\tilde a_1 \in U(a_1)$,

$$\varlimsup_{n} \frac{\mathsf{E}_{a_1}(\hat a_1 - a_1)^2}{\mathsf{E}_{a_1}(\tilde a_1 - a_1)^2} \le 1.$$


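Sequential maximum likelihood estimators also have the property of asymptotic
uniformity: as $\theta \to \infty$, we have

$$\sup_{|a_1| \le 1}\ \sup_{x}\ \Bigl|\mathsf{P}_{a_1}\bigl\{\sqrt{\langle M\rangle_{\tau(\theta)}}\,[\hat a_1(\tau(\theta)) - a_1] \le x\bigr\} - \Phi(x)\Bigr| \to 0,$$

$$\sup_{1 < r \le |a_1| \le R}\ \sup_{x}\ \Bigl|\mathsf{P}_{a_1}\bigl\{\sqrt{\langle M\rangle_{\tau(\theta)}}\,[\hat a_1(\tau(\theta)) - a_1] \le x\bigr\} - \Phi(x)\Bigr| \to 0.$$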
§ 2c. Mixed Autoregressive Moving Average Model ARMA(p, q)
and Integrated Model ARIMA(p, d, q)


1. These models combine the properties of the already considered MA(q) and 
AR(p) models, thus often providing a fair opportunity to find a model that can 
well explain the probabilistic backgrounds of statistical ‘stock’. 

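As above, we consider a filtered probability space $(\Omega, \mathscr{F}, (\mathscr{F}_n), \mathsf{P})$. It is now
convenient to assume that $\mathscr{F}_n = \sigma(\ldots, \varepsilon_{-1}, \varepsilon_0, \varepsilon_1, \ldots, \varepsilon_n)$, where $\varepsilon = (\varepsilon_n)$ is ‘white
noise’ (in the strict sense).

By definition (see § 1d), a sequence $h = (h_n)$ is governed by the model ARMA(p, q)
if

$$h_n = \mu_n + \sigma\varepsilon_n, \tag{1}$$

where
$$\mu_n = (a_0 + a_1 h_{n-1} + \cdots + a_p h_{n-p}) + (b_1\varepsilon_{n-1} + b_2\varepsilon_{n-2} + \cdots + b_q\varepsilon_{n-q}). \tag{2}$$

Without loss of generality we can assume that the value of $\sigma$ is known to be
equal to one: $\sigma = 1$. Then by (1) and (2) we obtain

$$h_n - (a_1 h_{n-1} + \cdots + a_p h_{n-p}) = a_0 + [\varepsilon_n + b_1\varepsilon_{n-1} + b_2\varepsilon_{n-2} + \cdots + b_q\varepsilon_{n-q}] \tag{3}$$

or
$$\alpha(L)h_n = a_0 + \beta(L)\varepsilon_n, \tag{4}$$
where
$$\alpha(L) = 1 - a_1 L - \cdots - a_p L^p \tag{5}$$
and
$$\beta(L) = 1 + b_1 L + \cdots + b_q L^q. \tag{6}$$

We note that if $q = 0$, then
$$\alpha(L)h_n = w_n,$$

where $w_n = a_0 + \varepsilon_n$, i.e., we arrive at the AR(p) model (cf. (5) in § 2b).
On the other hand, if $p = 0$, then (3) takes the following form:

$$h_n = a_0 + \beta(L)\varepsilon_n, \tag{7}$$

i.e., we obtain the MA(q) model (cf. (4) in § 2a).
Considering a formal conversion of (4) we see that

$$h_n = \mu + \frac{\beta(L)}{\alpha(L)}\varepsilon_n, \tag{8}$$
where
$$\mu = \frac{a_0}{1 - (a_1 + \cdots + a_p)}. \tag{9}$$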



We now consider the question of the existence of a stationary solution of
equation (3) (in the class $L^2$). By (8) and in view of our previous discussions (in § 2b.5),
the answer to this question depends on the properties of the operator $\alpha(L)$, that
is, of the autoregressive components of our model ARMA(p, q).

If all the roots of equation (37) in § 2b are smaller than one in absolute value
(and in this case $a_1 + \cdots + a_p \ne 1$), then this model has a unique stationary solution
$h = (h_n)$ (in the class $L^2$).

For this stationary solution,

$$\mathsf{E}h_n = \frac{a_0}{1 - (a_1 + \cdots + a_p)} \tag{10}$$

by (8) (cf. (42) in § 2b).
It is easy to conclude from (3) that the covariance $R(k) = \operatorname{Cov}(h_n, h_{n+k})$ satisfies
for $k > q$ the same relations

$$R(k) = a_1 R(k - 1) + \cdots + a_p R(k - p) \tag{11}$$
as in the AR(p) case (cf. formula (43) in § 2b).
If $k \le q$, then the corresponding representation of $R(k)$ has a more complicated
form than (11), since we must also take into account the correlation dependence
between $\varepsilon_{n-k}$ and $h_{n-p}$.


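2. As an illustration, we consider the model ARMA(1, 1), which is a combination
of AR(1) and MA(1):

$$h_n - a_1 h_{n-1} = a_0 + \varepsilon_n + b_1\varepsilon_{n-1}. \tag{12}$$
We assume that $|a_1| < 1$ (the ‘stationary’ case). Then
$$\alpha(L) = 1 - a_1 L, \qquad \beta(L) = 1 + b_1 L \tag{13}$$
and (8) can be written as follows:

$$h_n = \frac{a_0}{1 - a_1} + \frac{1 + b_1 L}{1 - a_1 L}\,\varepsilon_n = \frac{a_0}{1 - a_1} + \Bigl(\sum_{k=0}^{\infty} a_1^k L^k\Bigr)(1 + b_1 L)\,\varepsilon_n$$
$$= \frac{a_0}{1 - a_1} + (a_1 + b_1)\sum_{k=1}^{\infty} a_1^{k-1}\varepsilon_{n-k} + \varepsilon_n. \tag{14}$$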
FIGURE 20. Computer simulation of a sequence $h = (h_n)$ governed
by the ARMA(1, 1) model with $h_n = a_0 + a_1 h_{n-1} + b_1\varepsilon_{n-1} + \sigma\varepsilon_n$
($0 \le n \le 1000$), where $a_0 = -1$, $a_1 = 0.5$, $b_1 = 0.1$, $\sigma = 0.1$, and
$h_0 = 0$


Hence we immediately see that the covariance $R(k) = \operatorname{Cov}(h_n, h_{n+k})$ satisfies
the following relations:

$$R(k) = a_1 R(k - 1), \quad k \ge 2,$$
$$R(1) = a_1 R(0) + b_1, \tag{15}$$
$$R(0) = a_1 R(1) + (1 + a_1 b_1 + b_1^2),$$

and therefore

$$R(0) = \mathsf{D}h_n = \frac{1 + 2a_1 b_1 + b_1^2}{1 - a_1^2},$$
$$\rho(k) = \frac{R(k)}{R(0)} = \frac{(1 + a_1 b_1)(a_1 + b_1)}{1 + 2a_1 b_1 + b_1^2}\,a_1^{k-1}, \quad k \ge 1. \tag{16}$$

It should be mentioned that the correlation decreases geometrically as $k \to \infty$
for $|a_1| < 1$. This must be taken into account in the adjustment of the ARMA(1, 1)
models (or the more general ARMA(p, q) models) to particular statistical data.


3. The above-considered models ARMA(p, q) are well understood and used mostly
in the description of stationary time series. On the other hand, if a time series
$x = (x_n)$ is nonstationary, then the consideration of the differences
$\Delta x_n = x_n - x_{n-1}$ or the $d$-order differences $\Delta^d x_n$ brings one to the (occasionally) ‘more’
stationary sequence $\Delta^d x = (\Delta^d x_n)$.
There exist special expressions describing this situation: one says that a sequence
$x = (x_n)$ is governed by the ARIMA(p, d, q) model if $\Delta^d x = (\Delta^d x_n)$ is governed
by the model ARMA(p, q). (In a symbolic form, we write $\Delta^d$ ARIMA(p, d, q) =
ARMA(p, q).)

For a deeper insight into the meaning of these models we consider the particular
case of ARIMA(0, 1, 1). Here $\Delta x_n = h_n$, where $(h_n)$ is a sequence governed by
MA(1), i.e.,

$$\Delta x_n = \mu + (b_0 + b_1 L)\varepsilon_n.$$

Let $S$ be the operator of summation (‘integration’) defined by the formula
$S = \Delta^{-1}$, or, equivalently,
$$S = 1 + L + L^2 + \cdots = (1 - L)^{-1}.$$
Then we can formally write
$$x_n = (Sh)_n,$$

where $h_n = \mu + (b_0 + b_1 L)\varepsilon_n = \mu + b_0\varepsilon_n + b_1\varepsilon_{n-1}$.

Hence $x = (x_n)$ can be regarded as the result of the ‘integration’ of some
sequence $h = (h_n)$ governed by the MA(1) model, which explains the name ARIMA =
AR + I + MA. (Cf. Fig. 18 and Fig. 21.)
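On a finite sample, ‘integration’ is just a cumulative sum started from an initial value. The sketch below is an illustration under stated assumptions: Gaussian noise, the parameter values of Fig. 21, a hypothetical function name, and the finite-sample reading $x_n = x_0 + h_1 + \cdots + h_n$ of the formal relation $x = Sh$.

```python
import random
from itertools import accumulate


def simulate_arima011(mu, b0, b1, n, x0=0.0, seed=0):
    """Simulate x_0..x_n with increments h_k = mu + b0*eps_k + b1*eps_{k-1}.

    Returns (x, h), where x has n + 1 entries and h has n entries.
    """
    rng = random.Random(seed)
    eps = [rng.gauss(0.0, 1.0) for _ in range(n + 1)]    # eps_0, ..., eps_n
    h = [mu + b0 * eps[k] + b1 * eps[k - 1] for k in range(1, n + 1)]
    x = list(accumulate(h, initial=x0))                  # running sums from x_0
    return x, h


x, h = simulate_arima011(mu=1.0, b0=0.1, b1=1.0, n=100)

# Differencing undoes the summation: x_k - x_{k-1} recovers the MA(1) increments.
dx = [x[k] - x[k - 1] for k in range(1, len(x))]
```

The path $x$ drifts upward at rate $\mu$ per step and is not stationary, while its first differences `dx` form the stationary MA(1) sequence `h`.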


FIGURE 21. Computer simulation of a sequence $x = (x_n)$ governed
by the ARIMA(0, 1, 1) model with $\Delta x_n = \mu + b_1\varepsilon_{n-1} + b_0\varepsilon_n$, where
$\mu = 1$, $b_1 = 1$, $b_0 = 0.1$, and $x_0 = 0$


We have already mentioned that these models are widely used in the Box-Jenkins
theory [53]. For information about their applications in financial data
statistics, see, e.g., [351].
§ 2d. Prediction in Linear Models


1. We pointed out in the introduction to this section that the construction of 
probabilistic and statistical models (based on the ‘past’ data) is not an end in 
itself; it is necessary, in the long run, to predict the ‘future’ price movements. 

It is very seldom that one can give an ‘error-proof’ forecast using the ‘past’
data. (This is characteristic, e.g., of the so-called singular stationary sequences; see
subsection 4 below and, in more detail, e.g., [439; Chapter VI].)

The typical situation is, of course, the one when making a forecast we are in- 
evitably making an error, the size of which determines the risks involved in the 
solutions based on this forecast. 


2. In the case of stationary linear models there exists a well-developed (and beau- 
tiful) theory of the construction of optimal linear estimators (in the mean-square 
sense), which is mainly due to A. N. Kolmogorov and N. Wiener. 

We have already seen that many of the sequences $h = (h_n)$ considered can be
represented as one-sided moving averages

$$h_n = \sum_{k=0}^{\infty} a_k\varepsilon_{n-k}, \tag{1}$$

where $\sum_{k=0}^{\infty} |a_k|^2 < \infty$ and $\varepsilon = (\varepsilon_n)$ is some ‘basic’ sequence, white noise. (See
formula (15) in § 2a for the MA(q) and MA($\infty$) models, formula (41) in § 2b for the
AR(p) models, and (8) in § 2c for ARMA(p, q).)

To describe results on the extrapolation of these sequences we require certain
concepts and notation.

If $\xi = (\xi_n)$ is a stochastic sequence, then let $\mathscr{F}_n^\xi = \sigma(\ldots, \xi_{n-1}, \xi_n)$ be the
$\sigma$-algebra generated by the ‘past’ $\{\xi_k,\ k \le n\}$, let $\mathscr{F}^\xi = \bigvee_n \mathscr{F}_n^\xi$ be the $\sigma$-algebra
generated by all the variables $\xi_n$, let $H_n^\xi = \overline{\mathscr{L}}(\ldots, \xi_{n-1}, \xi_n)$ be the closed (in $L^2$)
linear manifold spanned by the variables $\{\xi_k,\ k \le n\}$, and let $H^\xi = \overline{\mathscr{L}}(\xi_k,\ k < \infty)$
be the closed linear manifold spanned by all the variables $\xi_n$.

Let $\eta = \eta(\omega)$ be a random variable with finite second moment $\mathsf{E}\eta^2(\omega)$. We now
formulate the problem of finding an estimate of $\eta$ in terms of our observations of
the sequence $\xi$.

The following two approaches are most widely used here.

If these are the variables $(\ldots, \xi_{n-1}, \xi_n)$ that we must observe, then in the
framework of the first approach, we consider the class of all $\mathscr{F}_n^\xi$-measurable estimators $\tilde\eta_n$,
and the one said to be the best (optimal) is the estimator $\hat\eta_n$ delivering the smallest
mean-square deviation, i.e.,

$$\mathsf{E}|\eta - \hat\eta_n|^2 = \inf_{\tilde\eta_n} \mathsf{E}|\eta - \tilde\eta_n|^2. \tag{2}$$
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As is well known, the estimator optimal in this sense must have the following 
form: 


în = E(n| F$), (3) 
where E(-|-) is the conditional expectation, which is, generally speaking, a nonlin- 
ear function of the observations (..., &:—1,&n). (The issues of the nonlinear optimal 


filtering, extrapolation, and interpolation for fairly wide classes of stochastic pro- 
cesses are considered, e.g., in [303].) 

In the framework of the other approach, on which we concentrate now, one
considers only the set of linear estimators together with its closure (in $L^2$), i.e.,
functions of $(\ldots, \xi_{n-1}, \xi_n)$ belonging to $H_n^{\xi}$.

In a similar way to (2), by the best (optimal) linear estimator of the variable $\eta$
we mean $\hat\lambda_n \in H_n^{\xi}$ such that

$$\mathsf{E}|\eta - \hat\lambda_n|^2 = \inf_{\tilde\lambda_n \in H_n^{\xi}} \mathsf{E}|\eta - \tilde\lambda_n|^2. \tag{4}$$

In this case we use the following notation for $\hat\lambda_n$ (cf. (3)):

$$\hat\lambda_n = \widehat{\mathsf{E}}(\eta \mid H_n^{\xi}), \tag{5}$$

where $\widehat{\mathsf{E}}(\,\cdot \mid \cdot\,)$ is called conditional expectation in the wide sense.


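3. We now return to the issue of (linear) prediction on the basis of the ‘information’
available by time 0 for the variables in a sequence $h = (h_n)$, $n \ge 1$, described by (1).
This problem can be solved relatively easily if the ‘information’ in question means
all the ‘past’ generated by $\varepsilon = (\varepsilon_n)_{n \le 0}$ (rather than by $h = (h_n)_{n \le 0}$), i.e.,

$$H_0^{\varepsilon} = \overline{\mathscr{L}}(\ldots, \varepsilon_{-1}, \varepsilon_0).$$

Indeed,

$$\hat h_n^{(\varepsilon)} \equiv \widehat{\mathsf{E}}(h_n \mid H_0^{\varepsilon}) = \widehat{\mathsf{E}}\Bigl(\sum_{k=0}^{n-1} a_k \varepsilon_{n-k} + \sum_{k=n}^{\infty} a_k \varepsilon_{n-k} \Bigm| H_0^{\varepsilon}\Bigr)$$

$$= \sum_{k=n}^{\infty} a_k \varepsilon_{n-k} + \widehat{\mathsf{E}}\Bigl(\sum_{k=0}^{n-1} a_k \varepsilon_{n-k} \Bigm| H_0^{\varepsilon}\Bigr) = \sum_{k=n}^{\infty} a_k \varepsilon_{n-k}, \tag{6}$$

since

$$\widehat{\mathsf{E}}(\varepsilon_i \mid H_0^{\varepsilon}) = \mathsf{E}\varepsilon_i \ (= 0)$$

for $i \ge 1$, which follows easily from the orthogonality of the components of the
white noise $\varepsilon = (\varepsilon_i)$.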
Moreover, it is not difficult to find the extrapolation error

$$\sigma_n^2 \equiv \mathsf{E}\bigl|h_n - \widehat{\mathsf{E}}(h_n \mid H_0^{\varepsilon})\bigr|^2 = \sum_{k=0}^{n-1} |a_k|^2.$$

Clearly, $\sigma_n^2 \le \sigma_{n+1}^2$ and

$$\lim_{n \to \infty} \sigma_n^2 = \sum_{k=0}^{\infty} |a_k|^2 \ (= \mathsf{E}h_n^2).$$

Hence the role of the ‘past data’ $H_0^{\varepsilon} = \overline{\mathscr{L}}(\ldots, \varepsilon_{-1}, \varepsilon_0)$ in our prediction of the
values of $h_n$ diminishes with the growth of $n$ and, in the limit (as $n \to \infty$), we
should take the mere expectation (i.e., 0 in our case) as the best linear estimator.
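The growth of the extrapolation error with the horizon can be sketched numerically. The following is only an illustration under stated assumptions: the function name `extrapolation_errors` and the choice $a_k = \theta^k$ (truncated after 200 terms) are hypothetical and do not come from the text.

```python
import numpy as np

def extrapolation_errors(a, n_max):
    """Return [sigma_1^2, ..., sigma_{n_max}^2], where sigma_n^2 = sum_{k=0}^{n-1} |a_k|^2."""
    a = np.asarray(a, dtype=float)
    partial = np.cumsum(np.abs(a) ** 2)      # partial[j] = |a_0|^2 + ... + |a_j|^2
    # Coefficients beyond the end of `a` are treated as zero, so the sum stops growing.
    idx = np.minimum(np.arange(n_max), len(a) - 1)
    return partial[idx]

# Hypothetical example: a_k = theta^k, truncated at 200 terms.
theta = 0.5
a = theta ** np.arange(200)
errors = extrapolation_errors(a, 10)
total_variance = np.sum(a ** 2)              # close to 1 / (1 - theta^2)
```

The errors are nondecreasing in the horizon and stay below `total_variance`, in line with the limit relation above.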
Of course, the problem of extrapolation on the basis of not necessarily
infinitely many variables $\{\varepsilon_k,\ k \le 0\}$, but finitely many ones (e.g., the variables
$\{\varepsilon_k,\ l \le k \le 0\}$) is also of interest. It should be noted, however, that this latter
problem is technically more complicated than the one under consideration now.

The representation

$$\hat h_n^{(\varepsilon)} = \sum_{k=n}^{\infty} a_k \varepsilon_{n-k} \tag{7}$$

so obtained solves the extrapolation problem for the $h_n$ on the basis of the (linear)
‘information’ $H_0^{\varepsilon}$ rather than the ‘information’ $H_0^{h}$ contained in $\overline{\mathscr{L}}(\ldots, h_{-1}, h_0)$.
Clearly, if we assume that

$$H_0^{\varepsilon} = H_0^{h}, \tag{8}$$

then using (7) we can (in principle, at any rate) express the corresponding estimator
$\widehat{\mathsf{E}}(h_n \mid H_0^{h})$ in the form of a sum $\sum_{k=n}^{\infty} \hat a_k h_{n-k}$.

4. In view of our assumptions (1) and (8), it would be useful to recall several 
general principles of the theory of stationary stochastic sequences. 


Let $\xi = (\xi_n)$ be a stationary sequence (in the wide sense).
We can represent each element $\eta \in H(\xi)$ as a sum of two orthogonal components:

$$\eta = \widehat{\mathsf{E}}(\eta \mid S(\xi)) + \bigl[\eta - \widehat{\mathsf{E}}(\eta \mid S(\xi))\bigr],$$

where $S(\xi) = \bigcap_n H_n^{\xi}$ is the (linear) information contained in the ‘infinitely remote
past’. Let $R(\xi)$ be the set of elements of the form $\eta - \widehat{\mathsf{E}}(\eta \mid S(\xi))$, where $\eta \in H(\xi)$.
Then we see that $H(\xi)$ can itself be represented as an orthogonal sum:
$H(\xi) = S(\xi) \oplus R(\xi)$.

We say that a stationary sequence $\xi = (\xi_n)$ is regular if $H(\xi) = R(\xi)$ and
singular if $H(\xi) = S(\xi)$.

The meaning of the condition $H(\xi) = S(\xi)$ is fairly clear: this indicates that all
the information provided by the variables in the sequence $\xi$ ‘lies in the infinitely
remote past’. This explains why singular sequences are also said to be purely or
completely deterministic. If $S(\xi) = \varnothing$ (i.e., $H(\xi) = R(\xi)$), then one says that the
sequence $\xi$ is purely or completely nondeterministic.

The following result sheds light on the concepts that we have just introduced. 


PROPOSITION 1. Each nondegenerate stationary (in the wide sense) random sequence
$\xi = (\xi_n)$ has a unique decomposition

$$\xi_n = \xi_n^{r} + \xi_n^{s},$$

where $\xi^{r} = (\xi_n^{r})$ is a regular sequence and $\xi^{s} = (\xi_n^{s})$ is a singular sequence; moreover,
$\xi^{r}$ and $\xi^{s}$ are orthogonal ($\mathrm{Cov}(\xi_n^{r}, \xi_m^{s}) = 0$ for all $n$ and $m$).
(See [439; Chapter IV, §5] for greater detail.)

We now introduce another important concept.

DEFINITION. We call a sequence $\varepsilon = (\varepsilon_n)$ an innovation sequence for $\xi = (\xi_n)$ if
a) $\varepsilon = (\varepsilon_n)$ is ‘white noise’ in the wide sense, i.e., $\mathsf{E}\varepsilon_n = 0$, $\mathsf{E}\varepsilon_n \varepsilon_m = 0$ for
$n \ne m$, and $\mathsf{E}\varepsilon_n^2 = 1$;
b) $H_n^{\xi} = H_n^{\varepsilon}$ for each $n$.

The meaning of the term ‘innovation’ is clear: the elements of the sequence
$\varepsilon = (\varepsilon_n)$ are orthogonal, therefore we can regard $\varepsilon_{n+1}$ as (‘new’) information that
‘updates’, ‘innovates’ the data in $H_n^{\varepsilon}$ and enables one to construct $H_{n+1}^{\varepsilon}$. Since
$H_n^{\varepsilon} = H_n^{\xi}$ for all $n$, $\varepsilon_{n+1}$ updates also the information in $H_n^{\xi}$. (Cf. Chapter I, § 2a
where we consider the ‘random walk’ conjecture and discuss the concept of ‘efficient
market’.)

The next result establishes a connection between the concepts introduced above.

PROPOSITION 2. A nondegenerate sequence $\xi = (\xi_n)$ is regular if and only if there
exist an innovation sequence $\varepsilon = (\varepsilon_n)$ and numbers $(a_k)_{k \ge 0}$ such that
$\sum_{k=0}^{\infty} |a_k|^2 < \infty$ and (P-a.s.)

$$\xi_n = \sum_{k=0}^{\infty} a_k \varepsilon_{n-k}. \tag{9}$$

(See the proof, e.g., in [439; Chapter VI, §5].)

The following result is an immediate consequence of Propositions 1 and 2.

PROPOSITION 3. If $\xi = (\xi_n)$ is a nondegenerate stationary sequence, then it has a
Wold expansion

$$\xi_n = \xi_n^{s} + \sum_{k=0}^{\infty} a_k \varepsilon_{n-k}, \tag{10}$$

where $\sum_{k=0}^{\infty} a_k^2 < \infty$ and $\varepsilon = (\varepsilon_n)$ is some innovation sequence (for $\xi^{r}$).

5. Assuming that the sequence $h = (h_n)$ has a representation (1) with innovation
sequence (for $h$) $\varepsilon = (\varepsilon_n)$ we obtain

$$\hat h_n \equiv \widehat{\mathsf{E}}(h_n \mid H_0^{h}) = \widehat{\mathsf{E}}(h_n \mid H_0^{\varepsilon}) \ (= \hat h_n^{(\varepsilon)}).$$

Hence

$$\hat h_n = \sum_{k=n}^{\infty} a_k \varepsilon_{n-k} = \sum_{k=n}^{\infty} \hat a_k h_{n-k} \tag{11}$$

by (6), where the last equality is a consequence of the relation $H_n^{\varepsilon} = H_n^{h}$ holding
for all $n$.

Formula (11) solves the problem of optimal linear extrapolation of the variables
$h_n$ on the basis of the ‘past’ information $\{h_k,\ k \le 0\}$ in principle. However, the
following two questions come naturally: when does a sequence $h = (h_n)$ admit
a representation (1) with innovation sequence $\varepsilon = (\varepsilon_n)$, and how can one find the
coefficients $\hat a_k$ in (11)?

The answer to the first question is contained in Proposition 2: the sequence
$h = (h_n)$ must be regular. It is here that the well-known Kolmogorov test
applies: if $h = (h_n)$ is a stationary sequence in the wide sense with spectral density
$f = f(\lambda)$ such that

$$\int_{-\pi}^{\pi} \ln f(\lambda)\, d\lambda > -\infty, \tag{12}$$

then this sequence is regular (see, e.g., [439; Chapter VI, §5]).


6. We recall that each stationary sequence in the wide sense $\xi = (\xi_n)$ with $\mathsf{E}\xi_n = 0$
has a spectral representation

$$\xi_n = \int_{-\pi}^{\pi} e^{i\lambda n}\, z(d\lambda), \tag{13}$$

where $z = z(\Delta)$ (with $\Delta \in \mathscr{B}([-\pi, \pi])$) is a complex-valued stochastic measure
with orthogonal values: $\mathsf{E}\, z(\Delta_1)\overline{z(\Delta_2)} = 0$ if $\Delta_1 \cap \Delta_2 = \varnothing$ (see, e.g., [439; Chapter VI, §3]).

In addition, the covariance function $R(n) = \mathrm{Cov}(\xi_{k+n}, \xi_k)$ has the representation

$$R(n) = \int_{-\pi}^{\pi} e^{i\lambda n}\, F(d\lambda) \tag{14}$$

with spectral measure $F(\Delta) = \mathsf{E}|z(\Delta)|^2$.
If there exists a (nonnegative) function $f = f(\lambda)$ such that

$$F(\Delta) = \int_{\Delta} f(\lambda)\, d\lambda, \tag{15}$$

then we call it the spectral density, while the function $F(\lambda) = F((-\infty, \lambda])$ is called
the spectral function.

Hence if we know a priori that the initial sequence $h = (h_n)$ is stationary in
the wide sense with spectral density $f = f(\lambda)$ satisfying condition (12), then this
sequence is regular, and admits the representation (9) with innovation sequence
$\varepsilon = (\varepsilon_n)$.

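7. In the linear models considered above we did not define the sequence $h = (h_n)$ in
terms of spectral representations; we used instead the series (1) with an orthonormal
sequence $\varepsilon = (\varepsilon_n)$ such that $\mathsf{E}\varepsilon_n = 0$ and $\mathsf{E}\varepsilon_i \varepsilon_j = \delta_{ij}$, where $\delta_{ij}$ is the Kronecker
delta:

$$\delta_{ij} = 1 \ \text{for } i = j \quad\text{and}\quad \delta_{ij} = 0 \ \text{for } i \ne j.$$

Clearly, the sequence $\varepsilon = (\varepsilon_n)$ is stationary and we have

$$R_{\varepsilon}(n) = \begin{cases} 1, & n = 0, \\ 0, & n \ne 0 \end{cases} \tag{16}$$

for its correlation (= covariance) function and

$$f_{\varepsilon}(\lambda) = \frac{1}{2\pi}, \qquad -\pi \le \lambda < \pi, \tag{17}$$

for the spectral density. (Because $R_{\varepsilon}(n) = \int_{-\pi}^{\pi} e^{i\lambda n} f_{\varepsilon}(\lambda)\, d\lambda$.)

If $h_n = \sum_{k=0}^{\infty} a_k \varepsilon_{n-k}$ and $\sum_{k=0}^{\infty} |a_k|^2 < \infty$, then it is easy to see that the covariance
function of the sequence $h = (h_n)$ can be defined as follows:

$$R_h(n) = \int_{-\pi}^{\pi} e^{i\lambda n} f_h(\lambda)\, d\lambda, \tag{18}$$

where

$$f_h(\lambda) = \frac{1}{2\pi} \bigl|\Phi(e^{-i\lambda})\bigr|^2 \tag{19}$$

and

$$\Phi(z) = \sum_{k=0}^{\infty} a_k z^k. \tag{20}$$

We now assume that the power series (20) has convergence radius $r > 1$ and
$\Phi$ does not vanish for $|z| \le 1$. Under this assumption we can answer the question
on the values of the coefficients $\hat a_k$ in the representation (11) for the optimal linear
prediction $\hat h_n$ as follows (see, e.g., [439; Chapter VI, § 6] for greater detail).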

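Let

$$\hat\varphi_n(\lambda) = e^{i\lambda n}\, \frac{\Phi_n(e^{-i\lambda})}{\Phi(e^{-i\lambda})}, \tag{21}$$

where

$$\Phi_n(z) = \sum_{k=n}^{\infty} a_k z^k. \tag{22}$$

We expand $\hat\varphi_n(\lambda)$ in the Fourier series:

$$\hat\varphi_n(\lambda) = \hat a_0^{(n)} + \hat a_1^{(n)} e^{-i\lambda} + \hat a_2^{(n)} e^{-2i\lambda} + \cdots. \tag{23}$$

Then the coefficients $\hat a_0^{(n)}, \hat a_1^{(n)}, \ldots$ are precisely those delivering the optimal linear
prediction $\hat h_n$ of $h_n$:

$$\hat h_n \equiv \widehat{\mathsf{E}}(h_n \mid H_0^{h}) = \sum_{k=0}^{\infty} \hat a_k^{(n)} h_{-k}. \tag{24}$$

Thus, we have a method for constructing a prediction of $h_n$ on the basis of the
‘past’ variables $\{\ldots, h_{-1}, h_0\}$, provided that we can find the representation (23).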
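When the coefficients $a_k$ are known numerically, the expansion (23) can be obtained by dividing power series in $z = e^{-i\lambda}$: by (21) and (22), $\hat\varphi_n = \bigl(\sum_{j \ge 0} a_{n+j} z^j\bigr)/\Phi(z)$. The sketch below is an illustration only; the name `predictor_coeffs` and the truncation of the input are assumptions, and the code presupposes $a_0 \ne 0$ and real coefficients.

```python
import numpy as np

def predictor_coeffs(a, n, num_coeffs):
    """First `num_coeffs` coefficients of the expansion (23) of phi_n.

    `a` holds a_0, a_1, ... from (1); entries beyond its end are treated as zero.
    """
    a = np.asarray(a, dtype=float)

    def coef(k):
        return a[k] if k < len(a) else 0.0

    c = np.zeros(num_coeffs)
    for k in range(num_coeffs):
        acc = coef(n + k)                    # coefficient of z^k in z^{-n} * Phi_n(z)
        for j in range(1, k + 1):
            acc -= coef(j) * c[k - j]        # subtract the known part of Phi(z) * c(z)
        c[k] = acc / a[0]
    return c

theta = 0.5
ma1 = predictor_coeffs([1.0, theta], n=1, num_coeffs=5)             # a = (1, theta, 0, ...)
ar1 = predictor_coeffs(theta ** np.arange(60), n=3, num_coeffs=5)   # a_k = theta^k
```

For the moving-average input the coefficients alternate in sign and decay geometrically, while for $a_k = \theta^k$ only the first coefficient is nonzero.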


8. To illustrate the above method we consider now several examples for the stationary
models MA(q), AR(p), and ARMA(p,q). We shall indicate explicit formulas
for the predictions only for several values of p and q. As regards general formulas,
see, e.g., [439; Chapter IV, § 6].


EXAMPLE 1 (The model MA(q)). Let

$$h_n = \beta(L)\varepsilon_n, \tag{25}$$

where

$$\beta(L) = b_0 + b_1 L + \cdots + b_q L^q, \tag{26}$$

i.e.,

$$h_n = \sum_{k=0}^{q} b_k \varepsilon_{n-k}. \tag{27}$$

Comparing this representation with (1) we see that $a_k = b_k$ for $0 \le k \le q$ and
$a_k = 0$, $k > q$.
Consequently,

$$\Phi(z) = \sum_{k=0}^{q} b_k z^k$$

and

$$\Phi_n(z) = \begin{cases} \sum_{k=n}^{q} b_k z^k, & n \le q, \\ 0, & n > q. \end{cases}$$

Hence

$$\hat\varphi_n(\lambda) = e^{i\lambda n}\, \frac{\sum_{k=n}^{q} b_k e^{-i\lambda k}}{\sum_{k=0}^{q} b_k e^{-i\lambda k}} \tag{28}$$

for $n \le q$ and $\hat\varphi_n(\lambda) = 0$ for $n > q$.

Thus, if $n > q$, then $\hat a_k^{(n)} = 0$ for all the coefficients in the expansion (23), therefore
the optimal prediction in this case is $\hat h_n = 0$. This is far from surprising, for the
correlation between $h_n$ and each of $h_0, h_{-1}, \ldots$ is equal to zero for $n > q$.

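If $q = 1$, then

$$h_n = b_0 \varepsilon_n + b_1 \varepsilon_{n-1}$$

and

$$\hat\varphi_1(\lambda) = e^{i\lambda}\, \frac{e^{-i\lambda} b_1}{b_0 + e^{-i\lambda} b_1} = \frac{b_1}{b_0 + e^{-i\lambda} b_1}.$$

Of course, we can assume from the very beginning that $b_0 = 1$ and can set
$b_1 = \theta$, where $|\theta| < 1$, which ensures that the function $\Phi(z) = 1 + \theta z$ does not
vanish for $|z| \le 1$. Then

$$\hat\varphi_1(\lambda) = \frac{\theta}{1 + \theta e^{-i\lambda}} = \theta \sum_{k=0}^{\infty} (-\theta e^{-i\lambda})^k = \sum_{k=0}^{\infty} (-1)^k \theta^{k+1} e^{-i\lambda k}.$$

Comparing this with (23) we see that, for $q = 1$ and $n = 1$,

$$\hat a_0^{(1)} = \theta, \quad \hat a_1^{(1)} = -\theta^2, \quad \ldots, \quad \hat a_{k-1}^{(1)} = (-1)^{k-1}\theta^{k}, \quad \ldots$$

Hence we obtain the following formula for the prediction of $h_1$ on the basis of
the ‘past’ information $(\ldots, h_{-1}, h_0)$ in the model MA(1):

$$\hat h_1 = \hat a_0^{(1)} h_0 + \hat a_1^{(1)} h_{-1} + \cdots + \hat a_k^{(1)} h_{-k} + \cdots$$

$$= \theta\,\bigl(h_0 - \theta h_{-1} + \theta^2 h_{-2} + \cdots + (-1)^{k}\theta^{k} h_{-k} + \cdots\bigr) = \sum_{k=0}^{\infty} (-1)^k \theta^{k+1} h_{-k}.$$

It is now clear that the largest contribution to our prediction of $h_1$ is the
contribution of the ‘most recent past’, the variable $h_0$, while the contribution from the
‘preceding’ variables decreases geometrically ($h_{-k}$ enters the formula with a coefficient
of absolute value $|\theta|^{k+1}$, where $|\theta| < 1$).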
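As a numerical cross-check of the MA(1) formula, one can solve the normal equations for the best linear predictor of $h_1$ from finitely many past values $h_0, \ldots, h_{-m+1}$ and compare the solution with the coefficients $(-1)^k\theta^{k+1}$. This is a sketch rather than part of the original argument: the function name `finite_past_predictor` and the values $\theta = 0.5$, $m = 40$ are assumptions, and the finite-past problem is used only as an approximation to the infinite-past one.

```python
import numpy as np

def finite_past_predictor(theta, m):
    """Coefficients of the best linear predictor of h_1 from (h_0, ..., h_{-m+1}) in MA(1)."""
    r = np.zeros(m + 1)
    r[0], r[1] = 1.0 + theta ** 2, theta     # R(0) and R(1); R(k) = 0 for k >= 2
    lags = np.abs(np.subtract.outer(np.arange(m), np.arange(m)))
    gram = r[lags]                           # Cov(h_{-i}, h_{-j}) = R(|i - j|)
    rhs = r[1:m + 1]                         # Cov(h_1, h_{-i}) = R(i + 1)
    return np.linalg.solve(gram, rhs)

theta, m = 0.5, 40
finite = finite_past_predictor(theta, m)
infinite = np.array([(-1) ** k * theta ** (k + 1) for k in range(m)])
max_gap = np.max(np.abs(finite - infinite))  # shrinks geometrically as m grows
```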
EXAMPLE 2 (The model AR(p)). We assume that (cf. (33) in § 2b)

$$h_n = a_1 h_{n-1} + \cdots + a_p h_{n-p} + \varepsilon_n \tag{29}$$

for $-\infty < n < \infty$; moreover, all the $\lambda_i$ ($i = 1, \ldots, p$) in the factorization (35) (§ 2b)
are distinct and $|\lambda_i| < 1$.
According to formula (41) in § 2b, the stationary solution of (29) can be represented
as follows:

$$h_n = \sum_{k=0}^{\infty} (c_1 \lambda_1^k + \cdots + c_p \lambda_p^k)\, \varepsilon_{n-k}, \tag{30}$$

i.e., in the representation (1) we have

$$a_k = c_1 \lambda_1^k + \cdots + c_p \lambda_p^k. \tag{31}$$

In principle, the prediction $\hat h_n$ of $h_n$ on the basis of the variables $h_k$, $k \le 0$,
is described by (24), where the coefficients $\hat a_k^{(n)}$ can be found from the Fourier
expansion of $\hat\varphi_n(\lambda)$ (see (21)-(23)) by taking into account formula (31) for the
coefficients $a_k$ in the definition of $\hat\varphi_n(\lambda)$.

Many methods described in the literature produce formulas for
the predictions $\hat h_n$ that are convenient in applications (see, e.g., [211], where one
can find a description of a recursion procedure enabling one to construct $\hat h_n$ from
$\hat h_{n-1}, \ldots, \hat h_{n-p}$).

We restrict ourselves to an illustration by carrying out a calculation for the
stationary model AR(1):

$$h_n = \theta h_{n-1} + \varepsilon_n \tag{32}$$

with $|\theta| < 1$.
We recall that $R(n) = \mathrm{Cov}(h_k, h_{k+n})$ can be defined in this case by the formula

$$R(n) = \frac{\theta^{|n|}}{1 - \theta^2}.$$

Hence it is easy to see that the spectral density $f = f(\lambda)$ is well defined and

$$f(\lambda) = \frac{1}{2\pi} \cdot \frac{1}{|1 - \theta e^{-i\lambda}|^2}. \tag{33}$$

Comparing (33) and (19) we see that

$$\Phi(z) = \frac{1}{1 - \theta z} = \sum_{k=0}^{\infty} (\theta z)^k.$$

Hence

$$\hat\varphi_n(\lambda) = \theta^n$$

by (21), and therefore (see (23) and (24))

$$\hat h_n = \theta^n h_0, \qquad n \ge 1,$$

i.e., for a prediction of $h_n$ on the basis of $\{h_k,\ k \le 0\}$ one requires only the value of
$h_0$. This is not surprising, given that $h = (h_n)$ is a Markov sequence.
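EXAMPLE 3 (The model ARMA(p, q) with parameters p = q = 1). Setting $a_0 = 0$
in equation (12) (§ 2c) we find the following formula for a stationary solution:

$$h_n = (a_1 + b_1) \sum_{k=1}^{\infty} a_1^{k-1} \varepsilon_{n-k} + \varepsilon_n,$$

where $|a_1| < 1$. Hence the coefficients in the representation $h_n = \sum_{k=0}^{\infty} \tilde a_k \varepsilon_{n-k}$ are
as follows:

$$\tilde a_0 = 1, \qquad \tilde a_k = (a_1 + b_1)\, a_1^{k-1}, \quad k \ge 1. \tag{34}$$

Consequently,

$$\Phi(z) = 1 + \frac{a_1 + b_1}{a_1} \sum_{k=1}^{\infty} (a_1 z)^k = 1 + \frac{a_1 + b_1}{a_1} \cdot \frac{a_1 z}{1 - a_1 z}$$

and

$$\Phi_n(z) = \sum_{k=n}^{\infty} \tilde a_k z^k = \frac{a_1 + b_1}{a_1} \sum_{k=n}^{\infty} (a_1 z)^k = \frac{a_1 + b_1}{a_1} \cdot \frac{(a_1 z)^n}{1 - a_1 z}$$

for $n \ge 1$. In view of (21) and (22), setting $z = e^{-i\lambda}$ we obtain (for $|b_1| < 1$)

$$\hat\varphi_n(\lambda) = e^{i\lambda n}\, \frac{\dfrac{a_1 + b_1}{a_1} \cdot \dfrac{(a_1 z)^n}{1 - a_1 z}}{1 + \dfrac{a_1 + b_1}{a_1} \cdot \dfrac{a_1 z}{1 - a_1 z}} = a_1^{n-1}(a_1 + b_1)\, \frac{1}{1 + b_1 z} = a_1^{n-1}(a_1 + b_1) \sum_{k=0}^{\infty} (-1)^k (b_1 z)^k$$

and (see (23))

$$\hat a_k^{(n)} = a_1^{n-1}(a_1 + b_1)(-1)^k b_1^k.$$

Hence

$$\hat h_n = a_1^{n-1}(a_1 + b_1)\bigl\{h_0 + (-1) b_1 h_{-1} + b_1^2 h_{-2} + \cdots\bigr\} = a_1^{n-1}(a_1 + b_1) \sum_{k=0}^{\infty} (-1)^k b_1^k h_{-k}$$

by (24).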


Remark. As regards the prediction formulas for the general models ARMA(p, q) 
and ARIMA(p, d, q), see, e.g., [211]. 


3. Nonlinear Stochastic Conditionally Gaussian Models 


The interest in nonlinear models has its origin in the quest for explanations
of several phenomena (apparent both in financial statistics and economics as a
whole) such as the ‘cluster property’ of prices, their ‘disastrous’ jumps and downfalls,
‘heavy tails’ of the distributions of the variables $h_n = \ln \dfrac{S_n}{S_{n-1}}$, the ‘long
memory’ in prices, and some others, which cannot be understood in the framework
of linear models.

On the other hand, there is no unanimity as to which of the nonlinear models
(stochastic, chaotic (‘dynamical chaos’), or other) should be used. Many advocates
adduce arguments in favor of one or another approach.

There can be no doubt that economic indicators, including financial indexes, are
prone to fluctuate.

Many macroeconomic indicators (the volumes of production, consumption, or
investment, the general level of prices, interest rates, government reserves, and so
on), which describe the state of the economy ‘on the average’, ‘at large’, do fluctuate,
and so do microeconomic indexes (current prices, the volume of traded
stocks, and so on). Moreover, these fluctuations can have a very high frequency
or be extremely irregular. This is known to occur also in stochastic and chaotic
models; hence the researchers’ attempts to describe fluctuation dynamics, abrupt
transitions, ‘catastrophic’ outbursts, the grouping of the values (the cluster property),
and so on, by means of these models.


Considering the behavior of many economic indexes (production volumes, the 
sizes of particular populations, or government reserves) ‘on the average’ one can 
discern certain trends, but this movement can accelerate or slow down; the growth 
can occur in cycles of sorts (periodic or aperiodic). 

Thus, an analyst of statistical data relating to the economy, finances, or some 
area of natural sciences or technology, finds himself in front of a rather nontrivial 
problem of the choice of a ‘right’ model. 
Below we describe several nonlinear stochastic and chaotic models that are popular
in financial mathematics and financial statistics. We are making no claims of a
comprehensive exposition, but are willing instead to present an ‘introduction’ into
this range of problems. (As regards some nonlinear models, we can recommend,
e.g., the monographs [193], [202], [461], and [462].)


§3a. ARCH and GARCH Models 


1. Let $(\Omega, \mathscr{F}, \mathsf{P})$ be the original probability space and let $\varepsilon = (\varepsilon_n)_{n \ge 1}$ be a sequence
of independent, normally distributed random variables ($\varepsilon_n \sim \mathscr{N}(0,1)$) simulating
the ‘randomness’, ‘uncertainty’ in the models that we consider below.

By $\mathscr{F}_n$ we shall mean the σ-algebra $\sigma(\varepsilon_1, \ldots, \varepsilon_n)$; we set $\mathscr{F}_0 = \{\varnothing, \Omega\}$.

We shall interpret $S_n = S_n(\omega)$ as the price (of a share or a bond, or an exchange
rate, etc.) at time $n = 0, 1, \ldots$. Here time can be measured in years, months, ...,
minutes, and so on.

As already mentioned (§ 1d), to describe the evolution of the variables

$$h_n = \ln \frac{S_n}{S_{n-1}}, \tag{1}$$

R. Engle [140] considered the conditionally Gaussian model with

$$h_n = \sigma_n \varepsilon_n. \tag{2}$$

Here the volatilities $\sigma_n$ are defined as follows:

$$\sigma_n^2 = \alpha_0 + \sum_{i=1}^{p} \alpha_i h_{n-i}^2, \tag{3}$$

where $\alpha_0 > 0$, $\alpha_i \ge 0$, and $h_0 = h_0(\omega)$ is a random variable independent of
$\varepsilon = (\varepsilon_n)_{n \ge 1}$. (One often sets $h_0$ to be a constant or a random variable with
expectation of its square chosen so that the expectations $\mathsf{E}h_n^2$, $n \ge 0$, are constant.)

We see from (3) that the $\sigma_n^2$ are (predictable) functions of $h_{n-1}^2, \ldots, h_{n-p}^2$. Moreover,
it is clear that large (small) values of the $h_{n-i}^2$ imply large (respectively, small)
values of $\sigma_n^2$. On the other hand, if $h_n^2$ turns out to be large, while the values of
the ‘preceding’ variables $h_{n-1}^2, \ldots, h_{n-p}^2$ are small, then this can be explained by
a large value of $\varepsilon_n$. This gives an insight into how (nonlinear) models (1)-(3) can
explain such phenomena as the ‘cluster property’, i.e., the grouping of the values
of the $h_n$ into batches of ‘large’ or ‘small’ values.

These considerations justify (as already mentioned in § 1d) the name of
ARCH(p) (AutoRegressive Conditional Heteroskedastic model) given to this model
by R. Engle [140]. The conditional variance (‘volatility’) $\sigma_n^2$ behaves in this model
in an extremely uneven way because, by (3), it depends on the ‘past’ variables
$h_{n-1}^2, h_{n-2}^2, \ldots$.
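The recursion (2)-(3) is straightforward to simulate. The sketch below is illustrative only: the function name `simulate_arch`, the ordering convention for the pre-sample values, and their default value of zero are assumptions made here, not prescriptions from the text.

```python
import numpy as np

def simulate_arch(alpha0, alphas, n, h_init=None, seed=0):
    """Simulate h_1, ..., h_n with h_k = sigma_k * eps_k and sigma_k^2 given by (3).

    `alphas` = (alpha_1, ..., alpha_p); `h_init` = (h_{1-p}, ..., h_0), oldest first.
    """
    rng = np.random.default_rng(seed)
    alphas = np.asarray(alphas, dtype=float)
    p = len(alphas)
    # Assumption: pre-sample values default to zero when not supplied.
    past = np.zeros(p) if h_init is None else np.asarray(h_init, dtype=float)
    h = np.concatenate([past, np.zeros(n)])  # h[p + k] stores h_{k+1}
    sigma2 = np.zeros(n)
    eps = rng.standard_normal(n)
    for k in range(n):
        lagged = h[k:k + p][::-1]            # h_k, h_{k-1}, ..., h_{k+1-p}
        sigma2[k] = alpha0 + np.dot(alphas, lagged ** 2)
        h[p + k] = np.sqrt(sigma2[k]) * eps[k]
    return h[p:], sigma2

# Parameters borrowed from the ARCH(1) illustration: alpha_0 = 0.9, alpha_1 = 0.2, h_0 = 3.
h, sigma2 = simulate_arch(0.9, [0.2], 100, h_init=[3.0])
```

Plotting `h` against `sigma2` makes the clustering visible: a large $|h_{n-1}|$ raises $\sigma_n^2$ and hence the likely size of $|h_n|$.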
2. We now consider several properties of sequences $h = (h_n)_{n \ge 1}$ governed by the
ARCH(p) model. For simplicity, we restrict ourselves to the case p = 1. (See,
e.g., [202], [193], and [393] for a thorough study of the properties of the ARCH(p)
models and their applications; we present one result concerning the occurrence of
‘heavy tails’ in such models in § 3c (subsection 6).)

For p = 1 (see Fig. 22) we have

$$\sigma_n^2 = \alpha_0 + \alpha_1 h_{n-1}^2. \tag{4}$$

The following simple properties of the $h_n = \sigma_n \varepsilon_n$ are obvious:

$$\mathsf{E}h_n = 0, \tag{5}$$

$$\mathsf{E}h_n^2 = \alpha_0 + \alpha_1 \mathsf{E}h_{n-1}^2, \tag{6}$$

$$\mathsf{E}(h_n^2 \mid \mathscr{F}_{n-1}) = \sigma_n^2 = \alpha_0 + \alpha_1 h_{n-1}^2. \tag{7}$$

FIGURE 22. Computer simulation of a sequence $h = (h_n)$ governed
by the ARCH(1) model with $h_n = \sqrt{\alpha_0 + \alpha_1 h_{n-1}^2}\, \varepsilon_n$ ($0 \le n \le 100$),
where $\alpha_0 = 0.9$, $\alpha_1 = 0.2$; $h_0 = 3$

If

$$0 < \alpha_1 < 1,$$

then the recursion relation (6) has the unique stationary solution

$$\mathsf{E}h_n^2 = \frac{\alpha_0}{1 - \alpha_1}, \tag{8}$$

therefore setting $h_0^2 = \dfrac{\alpha_0}{1 - \alpha_1}$ we obtain formula (8) for the $\mathsf{E}h_n^2$, $n \ge 1$.
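Next, a simple calculation shows that

$$\mathsf{E}h_n^4 = \mathsf{E}\sigma_n^4\, \mathsf{E}\varepsilon_n^4 = 3\,\mathsf{E}\sigma_n^4 = 3\,\mathsf{E}(\alpha_0 + \alpha_1 h_{n-1}^2)^2$$

$$= 3\bigl(\alpha_0^2 + 2\alpha_0\alpha_1 \mathsf{E}h_{n-1}^2 + \alpha_1^2\, \mathsf{E}h_{n-1}^4\bigr) = \frac{3\alpha_0^2(1 + \alpha_1)}{1 - \alpha_1} + 3\alpha_1^2\, \mathsf{E}h_{n-1}^4. \tag{9}$$

Hence, assuming that $0 < \alpha_1 < 1$ and $3\alpha_1^2 < 1$, we can obtain the following
‘stationary’ solution ($\mathsf{E}h_n^4 \equiv \mathrm{Const}$):

$$\mathsf{E}h_n^4 = \frac{3\alpha_0^2(1 + \alpha_1)}{(1 - \alpha_1)(1 - 3\alpha_1^2)}. \tag{10}$$

By (8) and (10), the ‘stationary’ value of the kurtosis is

$$K \equiv \frac{\mathsf{E}h_n^4}{(\mathsf{E}h_n^2)^2} - 3 = \frac{6\alpha_1^2}{1 - 3\alpha_1^2}.$$

It is positive, which means that the density of the ‘steady-state’ distribution of
the variables $h_n$ has a peak around the mean value (and the larger $\alpha_1$, the more
distinguished is this peak). We recall that $K = 0$ for a normal distribution.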
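Formulas (8) and (10) and the resulting kurtosis can be checked with a few lines of code. This is a minimal sketch; the function name `arch1_stationary_moments` is a label chosen here, and the parameter values are the ones quoted in the caption of Fig. 22.

```python
def arch1_stationary_moments(alpha0, alpha1):
    """Stationary E h^2, E h^4, and kurtosis K of ARCH(1), by (8), (10), and K = E h^4/(E h^2)^2 - 3."""
    if not (0 <= alpha1 < 1 and 3 * alpha1 ** 2 < 1):
        raise ValueError("need 0 <= alpha1 < 1 and 3 * alpha1^2 < 1")
    m2 = alpha0 / (1 - alpha1)
    m4 = 3 * alpha0 ** 2 * (1 + alpha1) / ((1 - alpha1) * (1 - 3 * alpha1 ** 2))
    kurtosis = m4 / m2 ** 2 - 3
    return m2, m4, kurtosis

m2, m4, kurtosis = arch1_stationary_moments(0.9, 0.2)
```

Here `m2` equals 0.9/0.8 = 1.125, and `kurtosis` agrees with $6\alpha_1^2/(1 - 3\alpha_1^2)$, which is about 0.27 for $\alpha_1 = 0.2$.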


Remark. The empirical value $\hat K_N$ of the kurtosis can be calculated from the values
of $h_1, h_2, \ldots, h_N$ by the formula

$$\hat K_N = \frac{\dfrac{1}{N}\sum_{k=1}^{N} (h_k - \bar h_N)^4}{\Bigl(\dfrac{1}{N}\sum_{k=1}^{N} (h_k - \bar h_N)^2\Bigr)^2} - 3,$$

where $\bar h_N = \dfrac{1}{N}(h_1 + \cdots + h_N)$. Judging by the data from S. Taylor’s book [460], it
is a rule rather than an exception that the kurtosis of financial indexes is positive.
The cases of negative kurtosis are very rare in practice. As regards the values of
the empirical kurtosis $\hat K_N$ in the cases when it is positive, one can consider the
following table relating to gold and silver prices, the exchange rate of the British
pound against the US dollar, and General Motors’ share price (calculated from the
monthly averages):

TABLE 1.
Asset             Period     Empirical kurtosis
Gold              1975-82    8.4
Silver            1970-74    8.4
GBP/USD           1974-82    5.4
General Motors    1966-76    4.2
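The empirical kurtosis from the Remark can be computed directly from a sample. The sketch below follows the displayed formula (normalization by $N$ in both sums, then subtraction of 3); the function name `empirical_kurtosis` and the sample values are illustrative assumptions.

```python
import numpy as np

def empirical_kurtosis(h):
    """Empirical kurtosis: fourth central sample moment over squared second, minus 3."""
    h = np.asarray(h, dtype=float)
    d = h - h.mean()                         # deviations from the sample mean
    return np.mean(d ** 4) / np.mean(d ** 2) ** 2 - 3.0

# Hypothetical sample, chosen only so that the arithmetic is easy to follow.
sample = [-2.0, -1.0, 0.0, 1.0, 2.0]
k_hat = empirical_kurtosis(sample)           # 6.8 / 2.0**2 - 3 = -1.3
```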
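3. For $0 < \alpha_1 < 1$ the sequence $h = (h_n)$ with $h_n = \sigma_n \varepsilon_n$ is a square integrable
martingale difference, therefore it is a sequence of orthogonal variables:

$$\mathrm{Cov}(h_n, h_m) = 0, \qquad n \ne m.$$

Of course, this does not mean that $h_n$ and $h_m$ are independent, because, as we see,
their joint distribution $\mathrm{Law}(h_n, h_m)$ is not Gaussian for $\alpha_1 > 0$.

One can get an idea of the character of the dependence between $h_n$ and $h_m$ by
considering the correlation dependence between their squares $h_n^2$ and $h_m^2$, or their
absolute values $|h_n|$ and $|h_m|$.

A simple calculation shows that

$$\mathsf{D}h_n^2 = \frac{2}{1 - 3\alpha_1^2} \Bigl(\frac{\alpha_0}{1 - \alpha_1}\Bigr)^2 \tag{11}$$

and

$$\mathsf{E}h_n^2 h_{n-1}^2 = \frac{1 + 3\alpha_1}{1 - 3\alpha_1^2} \cdot \frac{\alpha_0^2}{1 - \alpha_1}, \tag{12}$$

therefore

$$\rho(1) \equiv \mathrm{Corr}(h_n^2, h_{n-1}^2) = \frac{\mathrm{Cov}(h_n^2, h_{n-1}^2)}{\sqrt{\mathsf{D}h_n^2\, \mathsf{D}h_{n-1}^2}} = \alpha_1.$$

Further,

$$\mathsf{E}h_n^2 h_{n-k}^2 = \mathsf{E}\bigl[h_{n-k}^2\, \mathsf{E}(h_n^2 \mid \mathscr{F}_{n-1})\bigr] = \mathsf{E}\bigl[h_{n-k}^2\, \mathsf{E}(\sigma_n^2 \varepsilon_n^2 \mid \mathscr{F}_{n-1})\bigr]$$

$$= \mathsf{E}\bigl[h_{n-k}^2(\alpha_0 + \alpha_1 h_{n-1}^2)\bigr] = \alpha_0\, \mathsf{E}h_{n-k}^2 + \alpha_1\, \mathsf{E}h_{n-1}^2 h_{n-k}^2$$

for $k < n$, which gives the following simple recursion relation for

$$\rho(k) = \frac{\mathrm{Cov}(h_n^2, h_{n-k}^2)}{\sqrt{\mathsf{D}h_n^2\, \mathsf{D}h_{n-k}^2}}$$

in the ‘stationary case’:

$$\rho(k) = \alpha_1 \rho(k - 1),$$

so that

$$\rho(k) = \alpha_1^k. \tag{13}$$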
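The contrast between uncorrelated values and correlated squares can be observed on a simulated path. The following is a rough Monte Carlo sketch, not a proof: the helper names, the seed, the sample size, and the parameter values are assumptions, and the sample autocorrelations only approximate $\rho(k) = \alpha_1^k$.

```python
import numpy as np

def simulate_arch1(alpha0, alpha1, n, seed=0):
    """Simulate an ARCH(1) path, starting h_0^2 at the stationary value alpha0 / (1 - alpha1)."""
    rng = np.random.default_rng(seed)
    eps = rng.standard_normal(n)
    h = np.empty(n)
    prev_sq = alpha0 / (1 - alpha1)
    for k in range(n):
        h[k] = np.sqrt(alpha0 + alpha1 * prev_sq) * eps[k]
        prev_sq = h[k] ** 2
    return h

def sample_autocorr(x, max_lag):
    """Sample autocorrelations of x at lags 1, ..., max_lag."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    denom = np.dot(x, x)
    return np.array([np.dot(x[k:], x[:-k]) / denom for k in range(1, max_lag + 1)])

alpha0, alpha1 = 0.9, 0.2
h = simulate_arch1(alpha0, alpha1, 100_000, seed=1)
rho_h = sample_autocorr(h, 3)                # should be close to 0
rho_h2 = sample_autocorr(h ** 2, 3)          # should be close to alpha1 ** k
```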
4. We pointed out in § 1d that the ARCH(p) models are intimately connected with
(general) autoregressive schemes AR(p). For assume that we are considering the
ARCH(p) model and let $\nu_n = h_n^2 - \sigma_n^2$. If $\mathsf{E}h_n^2 < \infty$, then the sequence $\nu = (\nu_n)$ is
a martingale difference (with respect to the flow $(\mathscr{F}_n)$) and it follows from (3) that
the variables $x_n = h_n^2$ are governed by the autoregressive model AR(p):

$$x_n = \alpha_0 + \alpha_1 x_{n-1} + \cdots + \alpha_p x_{n-p} + \nu_n, \tag{14}$$

with noise $\nu = (\nu_n)$ that is a martingale difference.
If p = 1, then

$$x_n = \alpha_0 + \alpha_1 x_{n-1} + \nu_n, \tag{15}$$

therefore the above formula (13) is the same as (10) in § 2b.


5. The ARCH(p) models are also closely connected with autoregressive models with
random coefficients used in the description of ‘random walks in a random medium’.
To give an idea, we restrict ourselves again to p = 1.
Then it follows from the relations $h_n = \sigma_n \varepsilon_n$ and $\sigma_n^2 = \alpha_0 + \alpha_1 h_{n-1}^2$ that

$$h_n = \sqrt{\alpha_0 + \alpha_1 h_{n-1}^2}\; \varepsilon_n. \tag{16}$$

We now consider the following first-order autoregressive model with random
coefficients:

$$x_n = B_1 \eta_n x_{n-1} + B_0 \delta_n, \tag{17}$$

where $(\eta_n)$ and $(\delta_n)$ are two independent standard Gaussian sequences.
From the standpoint of finite-dimensional distributions the sequence $x = (x_n)$,
where we (for definiteness) set $x_0 = 0$, has the same structure as the sequence
$\tilde x = (\tilde x_n)$ with

$$\tilde x_n = \sqrt{B_0^2 + B_1^2 \tilde x_{n-1}^2}\; \varepsilon_n, \qquad \tilde x_0 = 0, \tag{18}$$

and $\varepsilon = (\varepsilon_n)$ a standard Gaussian sequence.

Comparing (16) and (18) we see that $h = (h_n)$ and $\tilde x = (\tilde x_n)$ are constructed
following the same pattern. Hence if $B_0^2 = \alpha_0$ and $B_1^2 = \alpha_1$, then the probabilistic
laws governing the sequences $h = (h_n)$ and $\tilde x = (\tilde x_n)$ with $h_0 = \tilde x_0 = 0$ are the
same.


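6. We now make the ARCH(1) model slightly more complicated by assuming that
the relations between the variables $h_n$, $n \ge 1$, are as follows:

$$h_n = \beta_0 + \beta_1 h_{n-1} + \sqrt{\alpha_0 + \alpha_1 h_{n-1}^2}\; \varepsilon_n. \tag{19}$$

In this case, it is said that $h = (h_n)$ is governed by the AR(1)/ARCH(1) model
or that $h = (h_n)$ satisfies the autoregressive scheme AR(1) with ARCH(1) noise
$\bigl(\sqrt{\alpha_0 + \alpha_1 h_{n-1}^2}\; \varepsilon_n\bigr)_{n \ge 1}$.

This model is conditionally Gaussian, therefore we can represent the density
$p_\theta(h_1, \ldots, h_n)$ of the joint distribution $\mathsf{P}_\theta$ of the variables $h_1, \ldots, h_n$ for a fixed
value of the parameter $\theta = (\alpha_0, \alpha_1, \beta_0, \beta_1)$ as follows ($h_0 = 0$):

$$p_\theta(h_1, \ldots, h_n) = (2\pi)^{-n/2} \prod_{k=1}^{n} (\alpha_0 + \alpha_1 h_{k-1}^2)^{-1/2} \times \exp\Bigl\{-\frac{1}{2} \sum_{k=1}^{n} \frac{(h_k - \beta_0 - \beta_1 h_{k-1})^2}{\alpha_0 + \alpha_1 h_{k-1}^2}\Bigr\}. \tag{20}$$

As an example of the application of this representation we consider the problem
of estimation (using the maximum likelihood method) of the unknown value of $\beta_1$
under the assumption that all other parameters, $\beta_0$, $\alpha_0$, and $\alpha_1$, are known.

The maximum likelihood estimator $\hat\beta_1$ of $\beta_1$ is the root of the equation

$$\frac{d p_{(\alpha_0, \alpha_1, \beta_0, \beta_1)}}{d\beta_1}(h_1, \ldots, h_n) = 0.$$

In view of (20) and (19), we obtain

$$\hat\beta_1 = \frac{\displaystyle\sum_{k=1}^{n} \frac{(h_k - \beta_0)\, h_{k-1}}{\alpha_0 + \alpha_1 h_{k-1}^2}}{\displaystyle\sum_{k=1}^{n} \frac{h_{k-1}^2}{\alpha_0 + \alpha_1 h_{k-1}^2}} \tag{21}$$

and

$$\hat\beta_1 = \beta_1 + \frac{M_n}{\langle M \rangle_n}, \tag{22}$$

where

$$M_n = \sum_{k=1}^{n} \frac{h_{k-1}\, \varepsilon_k}{\sqrt{\alpha_0 + \alpha_1 h_{k-1}^2}}$$

is a martingale and

$$\langle M \rangle_n = \sum_{k=1}^{n} \frac{h_{k-1}^2}{\alpha_0 + \alpha_1 h_{k-1}^2} \tag{23}$$

is its quadratic characteristic (cf. (46) in § 2b).

Here $\langle M \rangle_n \to \infty$ (P-a.s.), therefore $\dfrac{M_n}{\langle M \rangle_n} \to 0$ (P-a.s.) by the strong law of
large numbers for square integrable martingales (see (12) in § 1b and [439; Chapter VII, §5]).
Hence the estimators $\hat\beta_1$ so obtained are strongly consistent in the
following sense: $\mathsf{P}_\theta(\hat\beta_1 \to \beta_1) = 1$ for $\theta = (\alpha_0, \alpha_1, \beta_0, \beta_1)$ with $\beta_1 \in \mathbb{R}$.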
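The estimator (21) is a weighted least-squares ratio and can be tried out on simulated data. The sketch below is an illustration under assumptions: the helper names, the seed, and the parameter values are hypothetical, and the simulated check only suggests (it does not prove) the consistency stated above.

```python
import numpy as np

def simulate_ar1_arch1(beta0, beta1, alpha0, alpha1, n, seed=0):
    """Simulate h_0 = 0, h_1, ..., h_n from the AR(1)/ARCH(1) recursion (19)."""
    rng = np.random.default_rng(seed)
    eps = rng.standard_normal(n + 1)
    h = np.zeros(n + 1)
    for k in range(1, n + 1):
        scale = np.sqrt(alpha0 + alpha1 * h[k - 1] ** 2)
        h[k] = beta0 + beta1 * h[k - 1] + scale * eps[k]
    return h

def mle_beta1(h, beta0, alpha0, alpha1):
    """Estimator (21) of beta_1 from h = (h_0, ..., h_n), the other parameters being known."""
    h = np.asarray(h, dtype=float)
    prev, curr = h[:-1], h[1:]
    weights = 1.0 / (alpha0 + alpha1 * prev ** 2)
    return np.sum(weights * (curr - beta0) * prev) / np.sum(weights * prev ** 2)

# Hypothetical parameter values for the illustration.
path = simulate_ar1_arch1(beta0=0.1, beta1=0.3, alpha0=1.0, alpha1=0.2, n=20_000, seed=2)
beta1_hat = mle_beta1(path, beta0=0.1, alpha0=1.0, alpha1=0.2)
```

On a noise-free sequence satisfying $h_k = \beta_0 + \beta_1 h_{k-1}$ exactly, the ratio returns $\beta_1$ itself, which is a convenient sanity check of the formula.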
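7. We now consider the issue of the prediction of the price movement in the
future under the assumption that the sequence $h = (h_n)$ is governed by the
model ARCH(p).

Since the sequence $h = (h_n)$ is a martingale difference, it follows that
$\mathsf{E}(h_{n+m} \mid \mathscr{F}_n) = 0$. Hence the optimal (in the mean-square sense) estimator

$$\hat h_{n+m} = \mathsf{E}(h_{n+m} \mid \mathscr{F}_n^{h}) = \mathsf{E}\bigl(\mathsf{E}(h_{n+m} \mid \mathscr{F}_n) \mid \mathscr{F}_n^{h}\bigr)$$

is equal to 0, so that it seems reasonable to consider the prediction problem for the
future values of nonlinear functions of $h_{n+m}$; e.g., of $h_{n+m}^2$ and $|h_{n+m}|$.
We have

$$\widehat{h_{n+m}^2} \equiv \mathsf{E}(h_{n+m}^2 \mid \mathscr{F}_n^{h}) = \mathsf{E}(\sigma_{n+m}^2 \varepsilon_{n+m}^2 \mid \mathscr{F}_n^{h}) = \mathsf{E}\bigl[\mathsf{E}(\sigma_{n+m}^2 \varepsilon_{n+m}^2 \mid \mathscr{F}_{n+m-1}) \mid \mathscr{F}_n^{h}\bigr] = \mathsf{E}(\sigma_{n+m}^2 \mid \mathscr{F}_n^{h}) \ \bigl(\equiv \widehat{\sigma_{n+m}^2}\bigr) \tag{24}$$

(here $\mathscr{F}_n^{h} = \sigma(h_1, \ldots, h_n)$), therefore it is clear that this question on the prediction
of the future values of $h_{n+m}^2$ reduces to the problem of the prediction of the
‘volatility’ $\sigma_{n+m}^2$ on the basis of the past observations $h_0, h_1, \ldots, h_n$.

Consider the case p = 1. Since

$$\sigma_{n+m}^2 = \alpha_0 + \alpha_1 \sigma_{n+m-1}^2 \varepsilon_{n+m-1}^2,$$

it follows by induction that

$$\sigma_{n+m}^2 = \alpha_0 + \alpha_1\bigl[\alpha_0 + \alpha_1 \sigma_{n+m-2}^2 \varepsilon_{n+m-2}^2\bigr]\varepsilon_{n+m-1}^2 = \cdots = \alpha_0 + \alpha_0 \sum_{j=1}^{m-1} \prod_{i=1}^{j} \bigl(\alpha_1 \varepsilon_{n+m-i}^2\bigr) + \sigma_n^2 \prod_{i=1}^{m} \bigl(\alpha_1 \varepsilon_{n+m-i}^2\bigr).$$

Hence, considering the conditional expectation $\mathsf{E}(\,\cdot \mid \mathscr{F}_n^{h})$ and taking into account
the independence of the variables in the sequence $(\varepsilon_n)$ and the equality $\sigma_n^2 \varepsilon_n^2 = h_n^2$, we obtain

$$\widehat{h_{n+m}^2} = \widehat{\sigma_{n+m}^2} = \mathsf{E}(\sigma_{n+m}^2 \mid \mathscr{F}_n^{h}) = \alpha_0 + \alpha_0 \sum_{j=1}^{m-1} \alpha_1^j + \alpha_1^m h_n^2 = \alpha_0\, \frac{1 - \alpha_1^m}{1 - \alpha_1} + \alpha_1^m h_n^2. \tag{25}$$

As we might expect, the estimators $\widehat{h_{n+m}^2}$ converge as $m \to \infty$ (with probability
one) to the ‘stationary’ value $\mathsf{E}h_n^2 = \dfrac{\alpha_0}{1 - \alpha_1}$.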
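The closed form (25) can be compared with the one-step recursion $\mathsf{E}(h_{n+m}^2 \mid \mathscr{F}_n^h) = \alpha_0 + \alpha_1 \mathsf{E}(h_{n+m-1}^2 \mid \mathscr{F}_n^h)$ that underlies it. The sketch below covers the case p = 1 only; the function names and the numerical values are assumptions chosen for the illustration.

```python
def arch1_forecast(alpha0, alpha1, h_n, m):
    """m-step forecast of h_{n+m}^2 in ARCH(1), by the closed form (25)."""
    return alpha0 * (1 - alpha1 ** m) / (1 - alpha1) + alpha1 ** m * h_n ** 2

def arch1_forecast_recursive(alpha0, alpha1, h_n, m):
    """The same forecast, obtained by iterating f -> alpha0 + alpha1 * f, m times from h_n^2."""
    forecast = h_n ** 2
    for _ in range(m):
        forecast = alpha0 + alpha1 * forecast
    return forecast

# Hypothetical values: a large last observation h_n = 3 with alpha_0 = 0.9, alpha_1 = 0.2.
forecasts = [arch1_forecast(0.9, 0.2, 3.0, m) for m in range(1, 11)]
stationary = 0.9 / (1 - 0.2)                 # the limit of the forecasts as m grows
```

With these values the first forecast is $0.9 + 0.2 \cdot 9 = 2.7$, and the later ones decay geometrically towards `stationary`.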
8. We recall that the intervals $(\mu - \sigma, \mu + \sigma)$ and $(\mu - 1.65\sigma, \mu + 1.65\sigma)$ are (with a
certain degree of precision) confidence intervals for the normal distribution $\mathscr{N}(\mu, \sigma^2)$
with confidence levels 68% and 90%, respectively.
Since

$$S_{n+m} = S_n e^{h_{n+1} + \cdots + h_{n+m}} \tag{26}$$

and

$$\mathsf{E}\bigl[(h_{n+1} + \cdots + h_{n+m})^2 \mid \mathscr{F}_n^{h}\bigr] = \mathsf{E}(h_{n+1}^2 \mid \mathscr{F}_n^{h}) + \cdots + \mathsf{E}(h_{n+m}^2 \mid \mathscr{F}_n^{h}) = \widehat{\sigma_{n+1}^2} + \cdots + \widehat{\sigma_{n+m}^2},$$

it follows that we can as a first approximation take the intervals

$$\Bigl(S_n e^{-\sqrt{\widehat{\sigma_{n+1}^2} + \cdots + \widehat{\sigma_{n+m}^2}}},\ S_n e^{\sqrt{\widehat{\sigma_{n+1}^2} + \cdots + \widehat{\sigma_{n+m}^2}}}\Bigr)$$

and

$$\Bigl(S_n e^{-1.65\sqrt{\widehat{\sigma_{n+1}^2} + \cdots + \widehat{\sigma_{n+m}^2}}},\ S_n e^{1.65\sqrt{\widehat{\sigma_{n+1}^2} + \cdots + \widehat{\sigma_{n+m}^2}}}\Bigr)$$

as confidence intervals for $S_{n+m}$ (with confidence levels 68% and 90%, respectively).

Here it must be clear that writing about ‘first approximation’ we mean that
the variables $h_k$ are not normally distributed in general, and the question of the
difference between the actual confidence levels of the above confidence intervals and
68% (or 90%) calls for an additional investigation of the precision of the normal
approximation.
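For the ARCH(1) case the approximate intervals can be assembled from the volatility forecasts in a few lines. This is only a sketch of the ‘first approximation’ described above: the function name `arch1_price_interval`, the multiplier argument `z`, and the numerical values are assumptions, and the actual coverage may differ from the nominal levels.

```python
import math

def arch1_price_interval(s_n, alpha0, alpha1, h_n, m, z=1.0):
    """Approximate interval for S_{n+m}: S_n * exp(-/+ z * sqrt(sum of volatility forecasts)).

    z = 1.0 corresponds to the nominal 68% level and z = 1.65 to the nominal 90% level.
    """
    total = 0.0
    forecast = h_n ** 2
    for _ in range(m):
        forecast = alpha0 + alpha1 * forecast    # forecast of sigma_{n+j}^2, j = 1, ..., m
        total += forecast
    half_width = z * math.sqrt(total)
    return s_n * math.exp(-half_width), s_n * math.exp(half_width)

# Hypothetical daily-scale values: S_n = 100, alpha_0 = 1e-4, alpha_1 = 0.2, h_n = 0.01.
low68, high68 = arch1_price_interval(100.0, 1e-4, 0.2, 0.01, m=5)
low90, high90 = arch1_price_interval(100.0, 1e-4, 0.2, 0.01, m=5, z=1.65)
```

The interval is symmetric on the logarithmic scale, so the product of its endpoints equals $S_n^2$.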


9. The success of the conditionally Gaussian ARCH (p) model, which provided ex- 
planations for a variety of phenomena that could be distinguished in the behavior of 
the financial indexes (the ‘cluster property’, the ‘heavy tails’, the peaks (leptokur- 
tosis) of the distribution densities of the variables hn, ...) inspired an avalanche of 
its generalizations trying to ‘capture’, explain several other phenomena discovered 
by means of statistical analysis. 

Historically, one of the first generalizations of the ARCH(p) model was (as 
already mentioned in § 1d) the generalized ARCH model of T. Bollerslev [48] (1986). 
Characterized by two parameters p and q, it is called the GARCH (p, q) model. 

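In this model, just as in ARCH(p), we set $h_n = \sigma_n \varepsilon_n$. As regards the ‘volatility’
$\sigma_n$, however, we assume that

$$\sigma_n^2 = \alpha_0 + \sum_{i=1}^{p} \alpha_i h_{n-i}^2 + \sum_{j=1}^{q} \beta_j \sigma_{n-j}^2, \tag{27}$$

where $\alpha_0 > 0$, $\alpha_i \ge 0$, and $\beta_j \ge 0$. (If all the $\beta_j$ vanish, then we obtain
the ARCH(p) model.)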
The central advantage of the GARCH(p,q) models, as compared with their
forefather, the ARCH(p) model, is the following (experimental) fact: in the adjustments
of the GARCH(p,q) model to the statistical data one can restrict oneself to
small values of p and q, while in the framework of the ARCH(p) model we must
often consider uncomfortably large values of p. (In [431] the author had to use
autoregressive models of order twelve, AR(12), to describe the monthly averages of
the S&P500 Index. See also the review paper [141].)

One can carry out the analysis of the GARCH(p,q) models with ‘volatility’ $\sigma_n$
that is assumed to depend (in a predictable manner) both on the $h_{n-i}^2$, $i \le p$, and
the $\sigma_{n-j}^2$, $j \le q$, in a similar way to our analysis of the ARCH(p) models.

Omitting the details we present several simple formulas relating to the
GARCH(1,1) model, in which

$$h_n = \sigma_n \varepsilon_n \quad\text{and}\quad \sigma_n^2 = \alpha_0 + \alpha_1 h_{n-1}^2 + \beta_1 \sigma_{n-1}^2 \tag{28}$$

with $\alpha_0 > 0$, $\alpha_1 \ge 0$, and $\beta_1 \ge 0$.
It is clear that

$$\mathsf{E}h_n^2 = \alpha_0 + (\alpha_1 + \beta_1)\, \mathsf{E}h_{n-1}^2$$

and the ‘stationary’ value $\mathsf{E}h_n^2$ is well defined for $\alpha_1 + \beta_1 < 1$; namely,

$$\mathsf{E}h_n^2 = \frac{\alpha_0}{1 - \alpha_1 - \beta_1}. \tag{29}$$

If $3\alpha_1^2 + 2\alpha_1\beta_1 + \beta_1^2 < 1$, then we have a well-defined ‘stationary’ value

$$\mathsf{E}h_n^4 = \frac{3\alpha_0^2(1 + \alpha_1 + \beta_1)}{(1 - \alpha_1 - \beta_1)(1 - \beta_1^2 - 2\alpha_1\beta_1 - 3\alpha_1^2)}, \tag{30}$$

and therefore, for the ‘stationary kurtosis’ we obtain

$$K = \frac{\mathsf{E}h_n^4}{(\mathsf{E}h_n^2)^2} - 3 = \frac{6\alpha_1^2}{1 - \beta_1^2 - 2\alpha_1\beta_1 - 3\alpha_1^2}. \tag{31}$$

It is also easy to find the ‘stationary’ values of the autocorrelation function $\rho(k)$
(cf. (13)):

$$\rho(1) = \frac{\alpha_1(1 - \alpha_1\beta_1 - \beta_1^2)}{1 - 2\alpha_1\beta_1 - \beta_1^2}, \tag{32}$$

$$\rho(k) = (\alpha_1 + \beta_1)^{k-1} \rho(1), \qquad k > 1. \tag{33}$$

Finally, we point out that we can generalize (25) to the case of the GARCH(1,1)
models as follows:

$$\widehat{h_{n+m}^2} = \widehat{\sigma_{n+m}^2} = \mathsf{E}(\sigma_{n+m}^2 \mid \mathscr{F}_n^{h}) = \alpha_0\, \frac{1 - \gamma^m}{1 - \gamma} + \gamma^{m-1}(\alpha_1 h_n^2 + \beta_1 \sigma_n^2),$$

where $\gamma = \alpha_1 + \beta_1$.
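Formulas (29)-(32) and the forecast formula can be collected into two small functions. This is a sketch under assumptions: the function names and the sample parameter values are not from the text, and the forecast function takes $\sigma_n^2$ as a known input.

```python
def garch11_stationary(alpha0, alpha1, beta1):
    """Stationary E h^2, E h^4, kurtosis K, and rho(1) of GARCH(1,1), by (29)-(32)."""
    gamma = alpha1 + beta1
    d = 1 - beta1 ** 2 - 2 * alpha1 * beta1 - 3 * alpha1 ** 2
    if not (gamma < 1 and d > 0):
        raise ValueError("need alpha1 + beta1 < 1 and 3*alpha1^2 + 2*alpha1*beta1 + beta1^2 < 1")
    m2 = alpha0 / (1 - gamma)
    m4 = 3 * alpha0 ** 2 * (1 + gamma) / ((1 - gamma) * d)
    kurtosis = 6 * alpha1 ** 2 / d
    rho1 = alpha1 * (1 - alpha1 * beta1 - beta1 ** 2) / (1 - 2 * alpha1 * beta1 - beta1 ** 2)
    return m2, m4, kurtosis, rho1

def garch11_forecast(alpha0, alpha1, beta1, h_n, sigma2_n, m):
    """m-step forecast of sigma_{n+m}^2 given h_n and sigma_n^2 (generalization of (25))."""
    gamma = alpha1 + beta1
    return alpha0 * (1 - gamma ** m) / (1 - gamma) + gamma ** (m - 1) * (alpha1 * h_n ** 2 + beta1 * sigma2_n)

# Hypothetical parameter values for the illustration.
m2, m4, kurtosis, rho1 = garch11_stationary(0.1, 0.1, 0.8)
one_step = garch11_forecast(0.1, 0.1, 0.8, h_n=2.0, sigma2_n=1.5, m=1)   # 0.1 + 0.1*4 + 0.8*1.5
```

Setting `beta1 = 0` recovers the ARCH(1) quantities, which gives a quick consistency check against (8), (10), and (13).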
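10. Models in the ARCH family, which evolve in discrete time, have counterparts
in the continuous time case. Moreover, after a suitable normalization, we obtain a
(weak) convergence of the solutions of the stochastic difference equations characterizing
ARCH, GARCH, and other models to the solutions of the corresponding
differential equations.

For definiteness, we now consider the following modification of the GARCH(1,1)
model (which is called the GARCH(1,1)-M model in [15]).

Let $\Delta$ be the time step, and let $H^{(\Delta)} = (H_{k\Delta}^{(\Delta)})$, $k = 0, 1, \ldots$, where
$H_{k\Delta}^{(\Delta)} = H_0^{(\Delta)} + h_{\Delta}^{(\Delta)} + \cdots + h_{k\Delta}^{(\Delta)}$ and

$$h_{k\Delta}^{(\Delta)} = c\,\bigl(\sigma_{k\Delta}^{(\Delta)}\bigr)^2 \Delta + \sigma_{k\Delta}^{(\Delta)} \varepsilon_{k\Delta} \tag{34}$$

with $\varepsilon_{k\Delta} \sim \mathscr{N}(0, \Delta)$, a constant $c$, and

$$\bigl(\sigma_{k\Delta}^{(\Delta)}\bigr)^2 = \alpha_0(\Delta) + \bigl(\sigma_{(k-1)\Delta}^{(\Delta)}\bigr)^2 \bigl(\beta(\Delta) + \Delta^{-1}\alpha(\Delta)\, \varepsilon_{(k-1)\Delta}^2\bigr). \tag{35}$$

We set the initial condition $H_0^{(\Delta)} = H_0$ and $\sigma_0^{(\Delta)} = \sigma_0$ for all $\Delta > 0$, where $(H_0, \sigma_0)$
is a pair of random variables independent of the Gaussian sequences $(\varepsilon_{k\Delta})$, $\Delta > 0$,
of independent random variables.

We now embed the sequence $(H^{(\Delta)}, \sigma^{(\Delta)}) = (H_{k\Delta}^{(\Delta)}, \sigma_{k\Delta}^{(\Delta)})_{k \ge 0}$ in a scheme with
continuous time $t \ge 0$ by setting

$$H_t^{(\Delta)} = H_{k\Delta}^{(\Delta)} \quad\text{and}\quad \sigma_t^{(\Delta)} = \sigma_{k\Delta}^{(\Delta)} \tag{36}$$

for $k\Delta \le t < (k+1)\Delta$.

In view of the general results of the theory of weak convergence of random processes
(see, e.g., [250] and [304]) it seems natural to expect that, under certain conditions
on the coefficients in (34) and (35), the sequence of processes $(H^{(\Delta)}, \sigma^{(\Delta)})$
weakly converges (as $\Delta \to 0$, in the Skorokhod space $D$) to some diffusion process
$(H, \sigma) = (H_t, \sigma_t)_{t \ge 0}$.

As shown in [364], for

$$\alpha_0(\Delta) = \alpha_0 \Delta, \qquad \alpha(\Delta) = \alpha\sqrt{\frac{\Delta}{2}}, \qquad \beta(\Delta) = 1 - \alpha\sqrt{\frac{\Delta}{2}} - \beta\Delta,$$

the limiting process $(H, \sigma)$ satisfies the following stochastic differential equations
(see Chapter III, § 3e):

$$dH_t = c\,\sigma_t^2\, dt + \sigma_t\, dW_t^{(1)}, \tag{37}$$

$$d\sigma_t^2 = (\alpha_0 - \beta\sigma_t^2)\, dt + \alpha\,\sigma_t^2\, dW_t^{(2)}, \tag{38}$$

where $(W^{(1)}, W^{(2)})$ are two independent standard Brownian motions that are also
independent of the initial values $(H_0, \sigma_0) = (H_0^{(\Delta)}, \sigma_0^{(\Delta)})$.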
§3b. EGARCH, TGARCH, HARCH, and Other Models


1. In 1976, F. Black noticed the following phenomenon in the behavior of financial indexes: the variables $h_{n-1}$ and $\sigma_n$ are negatively correlated; namely, the empirical covariance $\mathrm{Cov}(h_{n-1},\sigma_n)$ is negative.

This phenomenon, which is called the leverage effect (or also the asymmetry effect), is responsible for the trend of growth in volatility after a drop in prices (i.e., when the logarithmic returns become negative). This cannot be understood in the framework of ARCH or GARCH models, where the volatility $\sigma_n^2$, which depends on the squares $h_{n-i}^2$, $i\ge1$, is indifferent to the signs of the $h_{n-i}$, so that in the GARCH models the values $h_{n-i}=a$ and $h_{n-i}=-a$ result in the same value of future volatility $\sigma_n^2$.

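To explain Black’s discovery, D. B. Nelson [366] put forward (1990) the so-called EGARCH(p, q) (Exponential GARCH(p, q)) model, in which the ‘asymmetry’ was taken into account by replacing the $h_{n-i}^2 = \sigma_{n-i}^2\varepsilon_{n-i}^2$ in the GARCH models with linear combinations of the variables $\varepsilon_{n-i}$ and $|\varepsilon_{n-i}|$. Namely, we assume again that $h_n=\sigma_n\varepsilon_n$, but the $\sigma_n$ must now satisfy the following relations:

$$\ln\sigma_n^2 = \alpha_0 + \sum_{i=1}^{p}\alpha_i\Bigl[\theta\varepsilon_{n-i} + \gamma\bigl(|\varepsilon_{n-i}| - \sqrt{2/\pi}\bigr)\Bigr] + \sum_{j=1}^{q}\beta_j\ln\sigma_{n-j}^2. \tag{1}$$

(We note that $\sqrt{2/\pi} = E|\varepsilon_{n-i}|$.)

Since $h_{n-i}=\sigma_{n-i}\varepsilon_{n-i}$ and $\sigma_{n-i}>0$, the signs of $h_{n-i}$ and $\varepsilon_{n-i}$ are the same. Hence if $\varepsilon_{n-i}=b>0$, then the part of the corresponding bracket in (1) that depends on $\varepsilon_{n-i}$ is equal to $b(\theta+\gamma)$, while if $\varepsilon_{n-i}=-b<0$, then it is equal to $b(-\theta+\gamma)$.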
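The asymmetry built into (1) can be made concrete with a few lines of code. The sketch below (the function name `egarch_news_term` is hypothetical) evaluates the bracket $\theta\varepsilon+\gamma(|\varepsilon|-\sqrt{2/\pi})$ for shocks of opposite sign; the difference between the two values is $2\theta b$, so with $\theta<0$ a negative shock raises $\ln\sigma_n^2$ more than a positive shock of the same size.

```python
import math

E_ABS_EPS = math.sqrt(2.0 / math.pi)  # E|eps| for eps ~ N(0, 1)


def egarch_news_term(eps, theta, gamma):
    """Bracket of (1): theta*eps + gamma*(|eps| - E|eps|)."""
    return theta * eps + gamma * (abs(eps) - E_ABS_EPS)


if __name__ == "__main__":
    theta, gamma, b = -0.3, 0.2, 1.5
    up = egarch_news_term(b, theta, gamma)     # shock +b
    down = egarch_news_term(-b, theta, gamma)  # shock -b
    print("difference up - down:", up - down, "expected 2*theta*b:", 2 * theta * b)
```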


2. The EGARCH models are not unique in capturing the asymmetry effect while retaining the main properties of the GARCH family. Another example is the TGARCH(p, q) model (‘T’ as in ‘threshold’), which was suggested by the threshold models of TAR (Threshold AR) type. In such models,

$$h_n = \sum_{i=1}^{k} I_{A_i}(h_{n-d})\bigl(\alpha_0^i + \alpha_1^i h_{n-1} + \dots + \alpha_p^i h_{n-p}\bigr), \tag{2}$$

where $d$ is a lag parameter and $A_1,\dots,A_k$ are disjoint subsets of $\mathbb R$ such that $\bigcup_{i=1}^{k}A_i=\mathbb R$.

For instance, we can set

$$h_n = \begin{cases}\alpha_0^1 + \alpha_1^1h_{n-1} + \alpha_2^1h_{n-2} & \text{if } h_{n-2}>0,\\[2pt] \alpha_0^2 + \alpha_1^2h_{n-1} + \alpha_2^2h_{n-2} & \text{if } h_{n-2}\le0.\end{cases} \tag{3}$$

(Such threshold models were thoroughly investigated in the monograph [461].)




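By definition (see [399]), a sequence $h=(h_n)$ is described by the TGARCH(p, q) model if $h_n=\sigma_n\varepsilon_n$, where

$$\sigma_n = a_0 + \sum_{i=1}^{p}\bigl[a_ih^+_{n-i} + b_ih^-_{n-i}\bigr] + \sum_{j=1}^{q}\bigl[c_j\sigma^+_{n-j} + d_j\sigma^-_{n-j}\bigr] \tag{4}$$

and, as usual, $x^+=\max(x,0)$ and $x^-=-\min(x,0)$. We do not assume in this model that the coefficients (and, therefore, the variables $\sigma_n$) are positive, although $\sigma_n^2$ retains its meaning of the conditional variance $E(h_n^2\mid\mathscr F_{n-1})$.

Since

$$h_n=\sigma_n\varepsilon_n=(\sigma_n^+-\sigma_n^-)(\varepsilon_n^+-\varepsilon_n^-)=[\sigma_n^+\varepsilon_n^++\sigma_n^-\varepsilon_n^-]-[\sigma_n^+\varepsilon_n^-+\sigma_n^-\varepsilon_n^+],$$

it follows that

$$h_n^+=[\sigma_n^+\varepsilon_n^++\sigma_n^-\varepsilon_n^-]\quad\text{and}\quad h_n^-=[\sigma_n^+\varepsilon_n^-+\sigma_n^-\varepsilon_n^+].$$

These relations enable one to rewrite (4) as follows:

$$\sigma_n = a_0 + \sum_{i=1}^{p^*}\alpha_i(\varepsilon_{n-i})\,\sigma^+_{n-i} + \sum_{i=1}^{p^*}\beta_i(\varepsilon_{n-i})\,\sigma^-_{n-i}, \tag{5}$$

where $p^*=\max(p,q)$ and the functions $\alpha_i(\varepsilon_{n-i})$ and $\beta_i(\varepsilon_{n-i})$ are linear combinations of $\varepsilon^+_{n-i}$ and $\varepsilon^-_{n-i}$.

The study of such models runs into certain technical difficulties rooted in the lack of the Markov property. Nevertheless, in simple cases (say, in the case of $p=q=1$) one can analyze the properties of these models fairly completely.

Indeed, let $p=q=1$. Then

$$\sigma_n = a_0 + [a_1h^+_{n-1}+b_1h^-_{n-1}] + [c_1\sigma^+_{n-1}+d_1\sigma^-_{n-1}], \tag{6}$$

or, equivalently,

$$\sigma_n = a_0 + \alpha_1(\varepsilon_{n-1})\,\sigma^+_{n-1} + \beta_1(\varepsilon_{n-1})\,\sigma^-_{n-1}, \tag{7}$$

where

$$\alpha_1(\varepsilon_{n-1}) = a_1\varepsilon^+_{n-1} + b_1\varepsilon^-_{n-1} + c_1,\qquad \beta_1(\varepsilon_{n-1}) = a_1\varepsilon^-_{n-1} + b_1\varepsilon^+_{n-1} + d_1. \tag{8}$$

If $a_0=0$, then

$$\sigma_n^+ = (\alpha_1(\varepsilon_{n-1}))^+\sigma^+_{n-1} + (\beta_1(\varepsilon_{n-1}))^+\sigma^-_{n-1},\qquad \sigma_n^- = (\alpha_1(\varepsilon_{n-1}))^-\sigma^+_{n-1} + (\beta_1(\varepsilon_{n-1}))^-\sigma^-_{n-1} \tag{9}$$

by (7). Hence it is clear that, with respect to the flow $(\mathscr F_n)$, the sequence $(\sigma_n^+,\sigma_n^-,\varepsilon_n)_{n\ge1}$ is Markov, which enables one to study it by usual ‘Markovian’ methods. (See [399] for greater detail.)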
3. We now consider another phenomenon, the ‘long memory’ or ‘strong aftereffect’ in the evolution of prices $S=(S_n)_{n\ge0}$.

There exist several ways to describe the dependence on the ‘past’ of the variables in a random sequence. In probability theory one has various measures of this dependence: ergodicity coefficients, mixing coefficients, and so on.

For instance, we can measure the rate at which the dependence on the past in a stationary sequence of real-valued variables $X=(X_n)$ fades away by the rate of the convergence to zero (as $m\to\infty$) of the supremum

$$\sup_A\bigl|P(X_{n+m}\in A\mid X_1,\dots,X_n) - P(X_{n+m}\in A)\bigr|,$$

taken over all Borel sets $A\subseteq\mathbb R$.

Of course, the (auto)correlation function is the standard measure of this depen- 
dence. 

It should be noted that, as shown by many statistical studies, financial time series exhibit a stronger correlation dependence between the variables in the sequences $|h|=(|h_n|)_{n\ge1}$ and $h^2=(h_n^2)_{n\ge1}$ than the one attainable in the framework of ARCH or GARCH (not to mention MA, AR, or ARMA) models.

We recall that, by formula (13) in § 3a,

$$\mathrm{Corr}(h^2_{n-k},h^2_n)=\alpha_1^k\quad\text{with }\alpha_1<1,$$

in the ARCH(1) model, while the autocorrelation function $\rho(k)$ for the GARCH(1,1) model is described by expressions (32) and (33) in the same § 3a. According to these formulas, the correlation in these models approaches zero at geometric rate (‘the past is quickly forgotten’).

One often says that a stationary (in the wide sense) sequence $Y=(Y_n)$ is a sequence with ‘long memory’ or ‘strong aftereffect’ if its autocorrelation function $\rho(k)$ approaches zero at hyperbolic rate, i.e.,

$$\rho(k)\sim ck^{-p},\qquad k\to\infty, \tag{10}$$

for some $p>0$.

This rate of decrease is characteristic, e.g., of the autocorrelation function of fractal Gaussian noise (see Chapter III, § 2d) $Y=(Y_n)_{n\ge1}$ with elements

$$Y_n = X_n - X_{n-1},$$

where $X=(X_t)_{t\ge0}$ is a fractal Brownian motion with parameter $H$, $0<H<1$ (see Chapter III, § 2c). For this motion we have (see (3) in Chapter III, § 2c)

$$\mathrm{Cov}(X_s,X_t)=\frac12\bigl\{|s|^{2H}+|t|^{2H}-|s-t|^{2H}\bigr\}\,EX_1^2$$

and (see (3) in Chapter III, § 2d)

$$\mathrm{Cov}(Y_n,Y_{n+k})=\frac{\sigma^2}{2}\bigl\{|k+1|^{2H}-2|k|^{2H}+|k-1|^{2H}\bigr\}, \tag{11}$$

where $\sigma^2=DY_n$. Hence the autocorrelation function $\rho(k)=\mathrm{Corr}(Y_n,Y_{n+k})$ decreases hyperbolically as $k\to\infty$:

$$\rho(k)\sim H(2H-1)k^{2H-2}.$$

We note that

$$\sum_{k=1}^{\infty}\rho(k)=\infty$$

for $\frac12<H<1$.

For $H=\frac12$ (a usual Brownian motion) the variables $Y=(Y_n)$ form Gaussian ‘white noise’ with $\rho(k)=0$, $k\ge1$.

On the other hand, if $0<H<\frac12$, then

$$\sum_{k=1}^{\infty}|\rho(k)|<\infty,\qquad \sum_{k=-\infty}^{\infty}\rho(k)=0.$$
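The behavior of $\rho(k)$ for fractal Gaussian noise in the three regimes of $H$ can be checked directly from (11). The following sketch (function names are illustrative) computes $\rho(k)=\frac12\{(k+1)^{2H}-2k^{2H}+(k-1)^{2H}\}$, compares it with the asymptotic $H(2H-1)k^{2H-2}$, and evaluates partial sums. For $0<H<\frac12$ the partial sums over $k\ge1$ telescope to $\frac12\{(K+1)^{2H}-K^{2H}-1\}$ and so approach $-\frac12$, which is the same as saying that the two-sided sum over all integer lags (with $\rho(0)=1$) vanishes.

```python
def fgn_rho(k, hurst):
    """Autocorrelation of fractal Gaussian noise at integer lag k >= 1, from (11)."""
    two_h = 2.0 * hurst
    return 0.5 * ((k + 1) ** two_h - 2.0 * k ** two_h + (k - 1) ** two_h)


def fgn_rho_asymptotic(k, hurst):
    """Leading term H(2H-1)k^(2H-2) of rho(k) as k grows."""
    return hurst * (2.0 * hurst - 1.0) * k ** (2.0 * hurst - 2.0)


def fgn_partial_sum(n_terms, hurst):
    """Partial sum rho(1) + ... + rho(n_terms)."""
    return sum(fgn_rho(k, hurst) for k in range(1, n_terms + 1))


if __name__ == "__main__":
    for hurst in (0.3, 0.5, 0.8):
        print(hurst, fgn_rho(100, hurst), fgn_partial_sum(10000, hurst))
```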


Remark. In Chapter 7 of the monograph [202] one can find a discussion of various 
models of processes with ‘strong aftereffect’ and much information about the appli- 
cations of these models in economics, biology, hydrology, and so on. See also [418]. 
4. Another model, HARCH(p), was introduced and analyzed in [360] and [89]. 
This is a model from the ARCH family in which the autocorrelation functions for 
the absolute values and the squares of the variables hn decrease slower than in 
the case of models of ARCH(p) or GARCH (p,q) kinds. The same ‘long memory’ 
phenomenon is characteristic of the FIGARCH models introduced in [15]. 

By definition, the HARCH(p) (Heterogeneous AutoRegressive Conditional Heteroskedastic) model (of order $p$) is defined by the relation

$$h_n=\sigma_n\varepsilon_n,$$

where

$$\sigma_n^2 = \alpha_0 + \sum_{j=1}^{p}\alpha_j\Bigl(\sum_{i=1}^{j}h_{n-i}\Bigr)^2$$

with $\alpha_0>0$, $\alpha_p>0$, and $\alpha_j\ge0$, $j=1,\dots,p-1$.

In particular, for $p=1$ we have

$$\sigma_n^2=\alpha_0+\alpha_1h_{n-1}^2,$$

i.e., HARCH(1) = ARCH(1).
For $p=2$,

$$\sigma_n^2=\alpha_0+\alpha_1h_{n-1}^2+\alpha_2(h_{n-1}+h_{n-2})^2. \tag{12}$$

We now consider several properties of this model.

First, we point out that the presence of the term $(h_{n-1}+h_{n-2})^2$ enables the model to ‘capture’ the above-mentioned asymmetry effects.

Further, if $\alpha_1+2\alpha_2<1$, then it follows from (12) that there exists a ‘stationary’ value

$$Eh_n^2=E\sigma_n^2=\frac{\alpha_0}{1-\alpha_1-2\alpha_2}. \tag{13}$$

In a similar way, considering $E\sigma_n^4$ and using the equalities

$$Eh_{n-1}h_{n-2}=Eh^3_{n-1}h_{n-2}=Eh_{n-1}h^3_{n-2}=0,$$

we obtain by (12) that for $(\alpha_1+\alpha_2)^2+\alpha_2^2<\frac13$, the ‘stationary value’

$$Eh_n^4=\frac{C}{\frac13-(\alpha_1+\alpha_2)^2-\alpha_2^2} \tag{14}$$

is well defined, where

$$C=\frac{\alpha_0^2\bigl[1+2\alpha_2(\alpha_1+3\alpha_2)-(\alpha_1+2\alpha_2)^2\bigr]}{\bigl[1-(\alpha_1+2\alpha_2)\bigr]^2}. \tag{15}$$

(We note that $E\sigma_n^4=\frac13Eh_n^4$.)

We now find the autocorrelation function for $(h_n^2)$.

Let $R(k)=Eh_n^2h_{n-k}^2$. Then in the ‘stationary’ case, for $k=1$ we have

$$\begin{aligned}R(1)&=E\sigma_n^2\varepsilon_n^2h_{n-1}^2=E\sigma_n^2h_{n-1}^2\\&=E\bigl(h_{n-1}^2\bigl[\alpha_0+\alpha_1h_{n-1}^2+\alpha_2(h_{n-1}+h_{n-2})^2\bigr]\bigr)\\&=\alpha_0Eh_{n-1}^2+\alpha_1Eh_{n-1}^4+\alpha_2Eh_{n-1}^4+2\alpha_2Eh_{n-1}^3h_{n-2}+\alpha_2Eh_{n-1}^2h_{n-2}^2.\end{aligned}$$

Consequently, if $\alpha_2<1$, then

$$R(1)=\frac{\alpha_0Eh_{n-1}^2+(\alpha_1+\alpha_2)Eh_{n-1}^4}{1-\alpha_2}. \tag{16}$$

Further, for $k\ge2$,

$$\begin{aligned}R(k)&=Eh_n^2h_{n-k}^2=E\sigma_n^2h_{n-k}^2\\&=E\bigl[\alpha_0+\alpha_1h_{n-1}^2+\alpha_2(h_{n-1}+h_{n-2})^2\bigr]h_{n-k}^2\\&=\alpha_0Eh_n^2+(\alpha_1+\alpha_2)R(k-1)+\alpha_2R(k-2),\end{aligned}$$

where $R(0)=Eh_n^4$.
Hence the autocorrelation function $\rho(k)=\mathrm{Corr}(h_n^2,h_{n-k}^2)$ clearly satisfies in the stationary case the equation

$$\rho(k)=A+B\rho(k-1)+C\rho(k-2),\qquad k\ge2, \tag{17}$$

where

$$A=\frac{\alpha_0Eh_n^2}{Dh_n^2}+\frac{(\alpha_1+\alpha_2)(Eh_n^2)^2}{Dh_n^2}+\frac{(\alpha_2-1)(Eh_n^2)^2}{Dh_n^2},\qquad B=\alpha_1+\alpha_2,\qquad C=\alpha_2,$$

$$\rho(0)=1,\qquad \rho(1)=\frac{R(1)-(Eh_n^2)^2}{Dh_n^2}.$$

We continue our analysis of the ‘strong aftereffect’ in Chapter IV, § 3e, while discussing exchange rates.
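The HARCH recursion and the ‘stationary’ value (13) can be illustrated by simulation. The sketch below is only an illustration under stated assumptions: the helper names, the parameter values $\alpha_0=1$, $\alpha_1=0.1$, $\alpha_2=0.05$, the seed, and the burn-in length are all choices made here, not taken from the text. It evaluates $\sigma_n^2=\alpha_0+\sum_j\alpha_j(h_{n-1}+\dots+h_{n-j})^2$ and compares the sample mean of $h_n^2$ with $\alpha_0/(1-\alpha_1-2\alpha_2)$.

```python
import math
import random


def harch_sigma2(past_h, alpha0, alphas):
    """sigma_n^2 for HARCH(p); past_h[0] = h_{n-1}, past_h[1] = h_{n-2}, ..."""
    sigma2 = alpha0
    partial_sum = 0.0
    for j, alpha_j in enumerate(alphas):
        partial_sum += past_h[j]            # h_{n-1} + ... + h_{n-j-1}
        sigma2 += alpha_j * partial_sum ** 2
    return sigma2


def simulate_harch(n, alpha0, alphas, seed=0, burn_in=1000):
    """Simulate h_n = sigma_n * eps_n with standard Gaussian eps_n."""
    rng = random.Random(seed)
    past = [0.0] * len(alphas)
    out = []
    for t in range(n + burn_in):
        h = math.sqrt(harch_sigma2(past, alpha0, alphas)) * rng.gauss(0.0, 1.0)
        past = [h] + past[:-1]
        if t >= burn_in:
            out.append(h)
    return out


if __name__ == "__main__":
    h = simulate_harch(100000, 1.0, [0.1, 0.05], seed=1)
    print("sample E h^2:", sum(x * x for x in h) / len(h), "theory:", 1.0 / (1 - 0.1 - 2 * 0.05))
```

With a single coefficient (`alphas=[alpha1]`) the first function reduces to the ARCH(1) rule $\alpha_0+\alpha_1h_{n-1}^2$.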


§ 3c. Stochastic Volatility Models 


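1. A characteristic feature of these models, introduced in § 1d, is the existence of two sources of randomness, $\varepsilon=(\varepsilon_n)$ and $\delta=(\delta_n)$, governing the behavior of the sequence $h=(h_n)$ so that

$$h_n=\sigma_n\varepsilon_n, \tag{1}$$

where $\sigma_n=e^{\frac12\Delta_n}$ and the sequence $(\Delta_n)$ is in the class AR(p), i.e.,

$$\Delta_n=a_0+\sum_{i=1}^{p}a_i\Delta_{n-i}+c\delta_n. \tag{2}$$

We shall assume that $\varepsilon=(\varepsilon_n)$ and $\delta=(\delta_n)$ are independent standard Gaussian sequences. Then we shall say that $h=(h_n)$ is governed by the SV(p) (Stochastic Volatility) model.

We now consider the properties of this model in the case of $p=1$ and $|a_1|<1$. We have

$$h_n=\sigma_n\varepsilon_n,\qquad \ln\sigma_n^2=a_0+a_1\ln\sigma_{n-1}^2+c\delta_n. \tag{3}$$

Let $\mathscr F_n^{\varepsilon,\delta}=\sigma(\varepsilon_1,\dots,\varepsilon_n;\delta_1,\dots,\delta_n)$ and let $\mathscr F_n^{\delta}=\sigma(\delta_1,\dots,\delta_n)$. Clearly,

$$E(h_n\mid\mathscr F_n^{\delta})=\sigma_nE\varepsilon_n=0$$

and

$$\begin{aligned}E(h_n\mid\mathscr F_{n-1}^{\varepsilon,\delta})&=E(\sigma_n\varepsilon_n\mid\mathscr F_{n-1}^{\varepsilon,\delta})\\&=E\bigl(\sigma_nE(\varepsilon_n\mid\mathscr F_{n-1}^{\varepsilon,\delta}\vee\sigma(\delta_n))\mid\mathscr F_{n-1}^{\varepsilon,\delta}\bigr)\\&=E(\sigma_nE\varepsilon_n\mid\mathscr F_{n-1}^{\varepsilon,\delta})=0\end{aligned}$$

because $E\varepsilon_n=0$.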
Hence the sequence $h=(h_n)$ is a martingale difference with respect to the flow $(\mathscr F_n^{\varepsilon,\delta})$. (Although not with respect to $(\mathscr F_n^{\delta})$ because the $h_n$ are not $\mathscr F_n^{\delta}$-measurable.)

Further,

$$Eh_n^2=E\sigma_n^2\,E\varepsilon_n^2=E\sigma_n^2=Ee^{\Delta_n}.$$

We shall assume that

$$\Delta_0\sim\mathscr N\Bigl(\frac{a_0}{1-a_1},\ \frac{c^2}{1-a_1^2}\Bigr). \tag{4}$$

By (3), the sequence $\Delta=(\Delta_n)$ fits into the autoregressive scheme AR(1) (i.e., $\Delta_n=a_0+a_1\Delta_{n-1}+c\delta_n$) and is stationary (see § 2b).

By (4),

$$Eh_n^2=Ee^{\Delta_n}=\exp\Bigl\{\frac{a_0}{1-a_1}+\frac{c^2}{2(1-a_1^2)}\Bigr\},$$

where we evaluate $Ee^{\Delta_n}$ using the fact that

$$Ee^{\sigma\xi-\frac12\sigma^2}=1$$

for each $\sigma$ and a random variable $\xi\sim\mathscr N(0,1)$.

In a similar way,

$$E|h_n|=E|\varepsilon_n|\,E\sigma_n=\sqrt{\frac{2}{\pi}}\,Ee^{\frac12\Delta_n}=\sqrt{\frac{2}{\pi}}\,\exp\Bigl\{\frac{a_0}{2(1-a_1)}+\frac{c^2}{8(1-a_1^2)}\Bigr\}.$$

We now consider the covariance properties of the sequences $h=(h_n)$ and $h^2=(h_n^2)$.

We have

$$Eh_nh_{n+1}=0$$

and, more generally,

$$Eh_nh_{n+k}=0$$

for each $k\ge1$. Hence $h=(h_n)$ is a sequence of uncorrelated random variables: if $R_h(k)=Eh_nh_{n+k}$, then

$$R_h(k)=\begin{cases}Eh_n^2,&k=0,\\0,&k>0.\end{cases}$$
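Further,

$$\begin{aligned}Eh_n^2h_{n-1}^2&=E\sigma_n^2\sigma_{n-1}^2=Ee^{\Delta_n+\Delta_{n-1}}\\&=E\bigl(e^{\Delta_{n-1}}E(e^{a_0+a_1\Delta_{n-1}+c\delta_n}\mid\Delta_{n-1})\bigr)\\&=Ee^{a_0+\Delta_{n-1}(1+a_1)}\,Ee^{c\delta_n}=e^{a_0+\frac{c^2}{2}}\,Ee^{(1+a_1)\Delta_{n-1}}\\&=e^{a_0+\frac{c^2}{2}}\exp\Bigl\{\frac{(1+a_1)a_0}{1-a_1}+\frac{(1+a_1)^2c^2}{2(1-a_1^2)}\Bigr\}\\&=\exp\Bigl\{\frac{2a_0}{1-a_1}+\frac{c^2}{2}+\frac{c^2(1+a_1)}{2(1-a_1)}\Bigr\}=\exp\Bigl\{\frac{2a_0+c^2}{1-a_1}\Bigr\}.\end{aligned}$$

Hence

$$\mathrm{Cov}(h_n^2,h_{n-1}^2)=\exp\Bigl\{\frac{2a_0+c^2}{1-a_1}\Bigr\}-\exp\Bigl\{\frac{2a_0}{1-a_1}+\frac{c^2}{1-a_1^2}\Bigr\}.$$

As might be expected, the variables $h_n^2$ and $h_{n-1}^2$ are positively correlated for $a_1>0$ and negatively correlated for $a_1<0$.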
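The closed-form moments of the SV(1) model lend themselves to a quick numerical cross-check. The sketch below (function names are illustrative) uses only the stationary law $\Delta_n\sim\mathscr N\bigl(a_0/(1-a_1),\,c^2/(1-a_1^2)\bigr)$ and the Gaussian identity $Ee^{u\Delta}=\exp\{u\,E\Delta+\frac12u^2D\Delta\}$, together with $\mathrm{Cov}(\Delta_n,\Delta_{n-k})=a_1^k\,D\Delta_n$; it confirms that the lag-one moment equals $\exp\{(2a_0+c^2)/(1-a_1)\}$ and that the sign of $\mathrm{Cov}(h_n^2,h_{n-1}^2)$ is the sign of $a_1$.

```python
import math


def sv_sigma_moment(r, a0, a1, c):
    """E sigma_n^r = E exp(r*Delta_n/2) under the stationary law of Delta_n."""
    mean = a0 / (1.0 - a1)
    var = c ** 2 / (1.0 - a1 ** 2)
    return math.exp(r * mean / 2.0 + r ** 2 * var / 8.0)


def sv_cross_moment(r, s, k, a0, a1, c):
    """E sigma_n^r sigma_{n-k}^s, using Cov(Delta_n, Delta_{n-k}) = a1^k * Var(Delta)."""
    var = c ** 2 / (1.0 - a1 ** 2)
    return (sv_sigma_moment(r, a0, a1, c) * sv_sigma_moment(s, a0, a1, c)
            * math.exp(r * s * a1 ** k * var / 4.0))


def sv_cov_squares(a0, a1, c):
    """Cov(h_n^2, h_{n-1}^2) = E sigma_n^2 sigma_{n-1}^2 - (E sigma_n^2)^2."""
    return sv_cross_moment(2, 2, 1, a0, a1, c) - sv_sigma_moment(2, a0, a1, c) ** 2
```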
Besides the above formulas

$$E\sigma_n^2=\exp\Bigl\{\frac{a_0}{1-a_1}+\frac{c^2}{2(1-a_1^2)}\Bigr\},\qquad E\sigma_n^2\sigma_{n-1}^2=\exp\Bigl\{\frac{2a_0+c^2}{1-a_1}\Bigr\},$$

we present also more general ones. Namely, for positive constants $r$ and $s$ we have

$$E\sigma_n^r=\exp\Bigl\{\frac{ra_0}{2(1-a_1)}+\frac{r^2c^2}{8(1-a_1^2)}\Bigr\},$$

$$E\sigma_n^r\sigma_{n-k}^s=E\sigma_n^r\,E\sigma_n^s\,\exp\Bigl\{\frac{rs\,a_1^k\,c^2}{4(1-a_1^2)}\Bigr\}.$$

These formulas can be used in the calculations of various moments of the $h_n$.

For example,

$$E|h_n|=\sqrt{\frac{2}{\pi}}\,E\sigma_n=\sqrt{\frac{2}{\pi}}\,\exp\Bigl\{\frac{a_0}{2(1-a_1)}+\frac{c^2}{8(1-a_1^2)}\Bigr\},$$

$$Eh_n^4=3E\sigma_n^4,\qquad Eh_n^2h_{n-k}^2=E\sigma_n^2\sigma_{n-k}^2,\qquad E|h_nh_{n-k}|=\frac{2}{\pi}\,E\sigma_n\sigma_{n-k}.$$

In particular, these relations yield the following expression for the ‘stationary’ kurtosis:

$$K=\frac{Eh_n^4}{(Eh_n^2)^2}-3=3\Bigl(\frac{E\sigma_n^4}{(E\sigma_n^2)^2}-1\Bigr)=\frac{3D\sigma_n^2}{(E\sigma_n^2)^2}=3\Bigl(\exp\Bigl\{\frac{c^2}{1-a_1^2}\Bigr\}-1\Bigr)>0,$$

which shows that the ‘stochastic volatility’ models with two sources of randomness $\varepsilon=(\varepsilon_n)$ and $\delta=(\delta_n)$ enable one (in a similar way to the ARCH family) to generate sequences $h=(h_n)$ such that the distribution densities for the $h_n$ have peaks around the mean value $Eh_n=0$ (the leptokurtosis phenomenon).


2. We now dwell on the issue of constructing volatility estimators $\hat\sigma_n$ on the basis of the observations $h_1,\dots,h_n$.

If $h_n=\mu+\sigma_n\varepsilon_n$, then $Eh_n=\mu$ and

$$E|h_n-\mu|=E|\sigma_n\varepsilon_n|=\sqrt{\frac{2}{\pi}}\,E\sigma_n.$$

It is natural to make this relation the starting point in our construction of the estimators $\hat\sigma_n$ of the volatilities $\sigma_n$ by adopting the formula

$$\hat\sigma_n=\sqrt{\frac{\pi}{2}}\,|h_n-\bar h_n| \tag{5}$$

with

$$\bar h_n=\frac1n\sum_{k=1}^{n}h_k$$

if $\mu$ is unknown and the formula

$$\hat\sigma_n=\sqrt{\frac{\pi}{2}}\,|h_n-\mu| \tag{6}$$

if $\mu$ is known.

Another estimation method for the $\sigma_n^2$ is based on the fact that $Eh_n^2=E\sigma_n^2$, i.e., on the properties of second-order moments.

Clearly, we could choose $\hat\sigma_n^2=h_n^2$ as an estimator for $\sigma_n^2$. This is an unbiased estimator, but its mean squared error

$$\begin{aligned}E|\hat\sigma_n^2-\sigma_n^2|^2&=E|h_n^2-\sigma_n^2|^2=Eh_n^4-2Eh_n^2\sigma_n^2+E\sigma_n^4\\&=4E\sigma_n^4-2E\sigma_n^4=2E\sigma_n^4=2\exp\Bigl\{\frac{2a_0}{1-a_1}+\frac{2c^2}{1-a_1^2}\Bigr\}\end{aligned}$$

can be fairly large.

Of course, if the variables $\sigma_k^2$, $k\le n$, are correlated, then we can use not only $h_n^2$ but also the preceding observations $h_{n-1}^2,h_{n-2}^2,\dots$ to construct estimators for $\sigma_n^2$.

It is clear that if the $\sigma_k^2$, $k\le n$, are weakly correlated, then the past variables $h_{n-1}^2,h_{n-2}^2,\dots$ must enter our formulas with small, decreasing weights. On the other hand, if the $\sigma_k^2$, $k\le n$, are strongly correlated, then the variables $h_{n-1}^2,h_{n-2}^2,\dots$ can be a source of important additional information about $\sigma_n^2$ (compared with the information contained in $h_n^2$).

This idea brings one to the consideration of exponentially weighted estimators

$$\hat\sigma_n^2=(1-\lambda)\sum_{k=0}^{\infty}\lambda^kh_{n-k}^2,\qquad 0<\lambda<1, \tag{7}$$

in which, as we see, $-\infty$, rather than time 0, is the starting point. Making this assumption we obtain the following formula for the stationary solution of the autoregressive recursion relation (2):

$$\Delta_n=\frac{a_0}{1-a_1}+c\sum_{k=0}^{\infty}a_1^k\delta_{n-k}, \tag{8}$$

where the series converges in mean square. As shown in § 2b, this is the unique stationary solution.

We note that $(1-\lambda)\sum_{k=0}^{\infty}\lambda^k=1$, i.e., the sum of the weight coefficients involved in the construction of $\hat\sigma_n^2$ is equal to one.

Since $Eh_{n-k}^2=E\sigma_{n-k}^2=E\sigma_n^2$, it follows that $E\hat\sigma_n^2=E\sigma_n^2$, which means that the estimator $\hat\sigma_n^2$ is unbiased (as is $h_n^2$).

We also point out that the quality of the estimator $\hat\sigma_n^2$ considerably depends on the chosen value of the parameter $\lambda$, so that there arises the (fairly complicated) problem of the ‘optimal’ choice of $\lambda$.

By (7) we see that the $\hat\sigma_n^2$ satisfy the recursion relations

$$\hat\sigma_n^2=\lambda\hat\sigma_{n-1}^2+(1-\lambda)h_n^2, \tag{9}$$

which is convenient in the construction of estimators by means of statistical analysis and model-building.
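The equivalence between the weighted sum (7) and the recursion (9) can be sketched as follows. In practice only finitely many observations are available, so the sketch assumes (as a simplification made here) that the recursion starts from $\hat\sigma_0^2=0$; the result then equals the sum in (7) truncated at the first observation. The names `ewma_variance` and `ewma_direct` are illustrative.

```python
def ewma_variance(h, lam, sigma2_init=0.0):
    """Run recursion (9): est_n = lam * est_{n-1} + (1 - lam) * h_n^2."""
    est = sigma2_init
    out = []
    for h_n in h:
        est = lam * est + (1.0 - lam) * h_n ** 2
        out.append(est)
    return out


def ewma_direct(h, lam):
    """Truncated version of (7): (1 - lam) * sum_k lam^k * h_{n-k}^2 over the available data."""
    n = len(h)
    return (1.0 - lam) * sum(lam ** k * h[n - 1 - k] ** 2 for k in range(n))


if __name__ == "__main__":
    returns = [0.5, -1.0, 0.2, 1.5, -0.7]
    print(ewma_variance(returns, 0.9)[-1], ewma_direct(returns, 0.9))
```

The recursive form needs only the previous estimate and the newest observation, which is why it is the convenient one for sequential updating.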


3. In our considerations of the model

$$h_n=e^{\frac12\Delta_n}\varepsilon_n\qquad(\Delta_n=\ln\sigma_n^2), \tag{10}$$

where $\Delta_n=a_0+a_1\Delta_{n-1}+c\delta_n$, it might seem reasonable to use

$$m_n=E(\Delta_n\mid h_1,\dots,h_n)$$

as an estimator for $\Delta_n$. This estimator would be optimal in the mean square sense. Unfortunately, this is a nonlinear scheme, which makes the problem of finding an explicit formula for $m_n$ almost hopeless. The first idea coming naturally to mind is to ‘linearize’ this problem and, after that, to use the theory of ‘Gaussian linear filtering’ of R. Kalman and R. Bucy (see, e.g., [303]).

For example, we can proceed as follows.

By (10) we obtain

$$\ln h_n^2=\ln\varepsilon_n^2+\ln\sigma_n^2=E\ln\varepsilon_n^2+\Delta_n+(\ln\varepsilon_n^2-E\ln\varepsilon_n^2),$$

where $E\ln\varepsilon_n^2\approx-1.27$ and $D\ln\varepsilon_n^2=\pi^2/2\approx4.93$. Hence setting $x_n=\ln h_n^2$ we arrive at the following linear system:

$$\Delta_n=a_0+a_1\Delta_{n-1}+c\delta_n, \tag{11}$$

$$x_n=E\ln\varepsilon_n^2+\Delta_n+\frac{\pi}{\sqrt2}\,\xi_n, \tag{12}$$

where

$$\xi_n=\frac{\sqrt2}{\pi}\bigl(\ln\varepsilon_n^2-E\ln\varepsilon_n^2\bigr) \tag{13}$$

with $E\xi_n=0$ and $D\xi_n=1$, and we can regard the above quantity $-1.27$ as an approximation to $E\ln\varepsilon_n^2$.

Thus, we can assume that we have a linear system (11)-(12), in which the distributions of the $\xi_n$ are non-Gaussian. This rules out a direct use of the Kalman-Bucy linear filtering.

Nevertheless, we shall consider the Kalman-Bucy filter as if the $\xi_n$, $n\ge1$, were normally distributed variables, $\xi_n\sim\mathscr N(0,1)$, independent of the sequence $(\delta_n)$.

Let $\mu_n=E(\Delta_n\mid x_1,\dots,x_n)$ and let $\gamma_n=E(\Delta_n-\mu_n)^2$ under this assumption. Then the evolution of the $\mu_n$ and $\gamma_n$ is described by the following system (see, e.g., [303; Chapter VI, § 7, Theorem 1]):

$$\mu_{n+1}=(a_0+a_1\mu_n)+\frac{a_1\gamma_n}{\frac{\pi^2}{2}+\gamma_n}\,(x_{n+1}+1.27-\mu_n), \tag{14}$$

$$\gamma_{n+1}=(a_1^2\gamma_n+c^2)-\frac{(a_1\gamma_n)^2}{\frac{\pi^2}{2}+\gamma_n}. \tag{15}$$

Here $\mu_0=E\Delta_0$ and $\gamma_0=D\Delta_0$.

We note that if the parameter $c$ in (11) is large, then $\Delta_n$ is the dominating term in the formula for $x_n$. Hence one can expect the $\mu_n$ to be ‘good’ approximations of the $m_n$.
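A linear filter for the log-volatility in the system (11)-(12) can be sketched in a few lines. The code below is a hedged illustration, not the exact recursion of the cited theorem: it uses the textbook predict/update form of the scalar Kalman filter, in which $x_n$ is treated as a noisy observation of $\Delta_n$ with noise variance $\pi^2/2$ and mean shift $-1.27$, so its indexing may differ from that of the filter equations quoted in the text. The function name and the constants `R_OBS` and `MEAN_LOG_EPS2` are introduced here for illustration.

```python
import math

R_OBS = math.pi ** 2 / 2.0   # variance of ln(eps^2) for eps ~ N(0, 1)
MEAN_LOG_EPS2 = -1.27        # approximation of E ln(eps^2) used in the text


def kalman_log_volatility(x, a0, a1, c, mu0, gamma0):
    """Filter Delta_n from x_n = ln(h_n^2), treating the observation noise as Gaussian.

    mu0 and gamma0 are the mean and variance of Delta_0.
    Returns a list of (filtered mean, filtered variance) pairs.
    """
    mu, gamma = mu0, gamma0
    out = []
    for x_n in x:
        # Prediction step from the AR(1) equation (11)
        mu_pred = a0 + a1 * mu
        p_pred = a1 ** 2 * gamma + c ** 2
        # Update step from the observation equation (12)
        gain = p_pred / (p_pred + R_OBS)
        mu = mu_pred + gain * (x_n - MEAN_LOG_EPS2 - mu_pred)
        gamma = p_pred * (1.0 - gain)
        out.append((mu, gamma))
    return out
```

The variance sequence does not depend on the data and converges to the fixed point of the corresponding Riccati equation; since the observation noise variance $\pi^2/2$ is large relative to typical values of $c^2$, the gain is small and the filter leans heavily on the AR(1) prediction.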
4. We point out that if the parameters $\theta=(a_0,a_1,c)$ of the initial model (11) are unknown, then one often uses the Bayes method to find an estimator $\hat\sigma_n$ for $\sigma_n$ on the basis of $(h_1,\dots,h_n)$. Its cornerstone is the assumption that we are given a prior distribution $\mathrm{Law}(\theta)$ of $\theta$. Then, in principle, we can find, first, the posterior distribution $\mathrm{Law}(\theta,\sigma_n\mid h_1,\dots,h_n)$ and then the posterior distributions $\mathrm{Law}(\theta\mid h_1,\dots,h_n)$ and $\mathrm{Law}(\sigma_n\mid h_1,\dots,h_n)$. This would enable us to construct the estimators $\hat\theta_n$ and $\hat\sigma_n$, e.g., as the posterior means or the values delivering maxima of the posterior densities. For a more comprehensive treatment of the Bayesian approach, see, for instance, [252] and the comments to this paper on pp. 395-417 of the same issue (v. 12, no. 4, 1994) of Journal of Business and Economic Statistics.


5. We have described the models in the GARCH family and ‘stochastic volatility’ models in the framework of the conditional approach. The conditional distribution $\mathrm{Law}(h_n\mid\sigma_n)$ has always been a normal distribution, $\mathscr N(0,\sigma_n^2)$, with $\sigma_n^2$ dependent on the ‘past’ in a ‘predictable’ way. Taking this approach it is natural to ask about the unconditional distributions $\mathrm{Law}(h_n)$ and $\mathrm{Law}(h_1,\dots,h_n)$.

To give a notion of the results possible here we consider now the following model (see [105]).

Let $h_n=\sigma_n\varepsilon_n$, where $(\varepsilon_n)$ is again a standard Gaussian sequence,

$$\sigma_n^2=a\sigma_{n-1}^2+b\delta_n,\qquad -\infty<n<\infty, \tag{16}$$

and $(\delta_n)$ is the sequence of nonnegative independent stable random variables with exponent $\alpha$, $0<\alpha<1$ (cf. Chapter III, § 1c.4). We assume that the sequences $(\varepsilon_n)$ and $(\delta_n)$ are independent.

If $0<a<1$, then we obtain recursively by (16) that

$$\sigma_n^2=b\sum_{k=0}^{\infty}a^k\delta_{n-k}+\lim_{m\to\infty}a^{m+1}\sigma_{n-m-1}^2. \tag{17}$$

By the self-similarity properties of stable distributions,

$$\sum_{k=0}^{\infty}a^k\delta_{n-k}\ \overset{d}{=}\ \Bigl(\sum_{k=0}^{\infty}a^{\alpha k}\Bigr)^{1/\alpha}\delta_1.$$

Consequently, if $0<\alpha<1$ and $0<a<1$, then (16) has a (finite) nonnegative ‘stationary’ solution $(\sigma_n^2)$, where

$$\sigma_n^2\ \overset{d}{=}\ b\Bigl(\frac{1}{1-a^{\alpha}}\Bigr)^{1/\alpha}\delta_1. \tag{18}$$

Hence we see from the definition ($h_n=\sigma_n\varepsilon_n$) that the stationary one-dimensional distribution $\mathrm{Law}(h_n)$ is stable, with stability exponent $2\alpha$.
6. To complete this section, devoted to nonlinear stochastic models and their properties, we dwell on the above-mentioned phenomenon of ‘heavy tails’ observable in these models. (See also Chapter IV, § 2c.)

We consider the ARCH(1) model of variables $h=(h_n)_{n\ge0}$ with initial condition $h_0$ independent of the standard Gaussian sequence $\varepsilon=(\varepsilon_n)_{n\ge1}$,

$$h_n=\sqrt{\alpha_0+\alpha_1h_{n-1}^2}\;\varepsilon_n$$

for $n\ge1$, $\alpha_0>0$, and $0<\alpha_1<1$.

For an appropriate choice of the distribution of $h_0$ this model turns out to have a solution $h=(h_n)_{n\ge0}$ that is a strictly stationary process exhibiting the phenomenon of ‘heavy tails’ (observable for sufficiently small $\alpha_1>0$): $P(h_n>x)\sim cx^{-\gamma}$, where $c>0$ and $\gamma>0$.

The corresponding (fairly tricky) proof can be found in the recent monograph of P. Embrechts, C. Klueppelberg, T. Mikosch “Modelling extremal events for insurance and finance”, Berlin, Springer-Verlag, 1997 (Theorems 8.4.9 and 8.4.12). One can also find there a thorough analysis of many models of ARCH, GARCH, and related kinds and a large list of literature devoted to these models.


4. Supplement: Dynamical Chaos Models 


§4a. Nonlinear Chaotic Models 


1. So far, in our descriptions of the evolution of the sequences $h=(h_n)$ with $h_n=\ln\dfrac{S_n}{S_{n-1}}$, where $S_n$ is the level of some ‘price’ at time $n$, we have proceeded from the conjecture that these variables were stochastic, i.e., the $S_n=S_n(\omega)$ and $h_n=h_n(\omega)$ were random variables defined on some filtered probability space $(\Omega,\mathscr F,(\mathscr F_n)_{n\ge1},P)$ and simulating the statistical uncertainty of ‘real-life’ situations.

On the other hand, it is well known that even very simple nonlinear deterministic systems of the type

$$x_{n+1}=f(x_n;\lambda) \tag{1}$$

or

$$x_{n+1}=f(x_n,x_{n-1},\dots,x_{n-k};\lambda), \tag{2}$$

where $\lambda$ is a parameter, can produce (for appropriate initial conditions) sequences with behavior very similar to that of stochastic sequences.

This justifies the following question: is it likely that many economic, including 
financial, series are actually realizations of chaotic (rather than stochastic) systems, 
i.e., systems described by deterministic nonlinear systems? It is known that such 
systems can bring about phenomena (e.g., the ‘cluster property’) observable in the 
statistical analysis of financial data (see Chapter IV). 

Referring to a rather extensive special literature for the formal definitions (see, e.g., [59], [71], [104], [198], [378], [379], [383], [385], [386], [428], or [456]), we now present several examples of nonlinear chaotic systems in order to provide the reader with a notion of their behavior. We shall also consider the natural question as to how one can guess the kind of the system (stochastic or chaotic) that has generated a particular realization.

In connection with forecasts of future price movements, the predictability problem is also of considerable interest in nonlinear chaotic models. As we shall see below, the situation here does not inspire one with much optimism, because, all the determinism notwithstanding, the behavior of the trajectories of chaotic systems can vary considerably after a small change in the initial data and the value of $\lambda$.


2. EXAMPLE 1. We consider the so-called logistic map

$$x\to Tx=\lambda x(1-x)$$

and the corresponding (one-dimensional) dynamical system

$$x_n=\lambda x_{n-1}(1-x_{n-1}),\qquad n\ge1,\quad 0<x_0<1. \tag{3}$$

(Apparently, logistic equations (3) occurred first in the models of population dynamics that imposed constraints on the growth of a population.)

FIGURE 23a. Case $\lambda=1$

For $\lambda\le1$ the solutions $x_n=x_n(\lambda)$ converge monotonically to 0 as $n\to\infty$ for all $0<x_0<1$ (Fig. 23a). Thus, the state $x_\infty=0$ is the unique stable state in this case, and it is the limit point of the $x_n$ as $n\to\infty$.

For $\lambda=2$ we have $x_n\uparrow\frac12$ (Fig. 23b). Hence there also exists in this case a unique stable state ($x_\infty=\frac12$) attracting the $x_n$ as $n\to\infty$.


We now consider larger values of $\lambda$. For $\lambda<3$ the system (3) still has a unique stable state. However, an entirely new phenomenon occurs for $\lambda=3$: as $n$ grows, one can distinguish two states $x_\infty$ (Fig. 23c), and the system alternates between these states.

This pattern is retained as $\lambda$ increases, until something new happens for $\lambda=3.4494\ldots$: the system now has four distinguished states $x_\infty$ and leaps from one to another (Fig. 23d).
FIGURE 23b. Case $\lambda=2$

FIGURE 23c. Case $\lambda=3$

New distinguished states come into being with further increases in $\lambda$: there are 8 such states for $\lambda=3.5441\ldots$, 16 for $\lambda=3.5644\ldots$, and so on. For $\lambda=3.6$, there exist infinitely many such states, which is usually interpreted as a loss of stability and a transition into a chaotic state.

Now the periodic character of the movements between different states is completely lost; the system wanders over an infinite set of states jumping from one to another. It should be pointed out that, although our system is deterministic, it is impossible in practice to predict its position at some later time because the limited precision in our knowledge of the values of the $x_n$ and $\lambda$ can considerably influence the results.
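The doubling of the number of limit states described above is easy to observe numerically. The following sketch (the function names, the burn-in length, and the rounding precision are choices made here for illustration) iterates (3) for a long time, discards the transient, and counts the distinct values visited afterwards, rounded to six decimals. Parameter values exactly at a bifurcation point (such as $\lambda=1$ or $\lambda=3$) converge only slowly and are therefore avoided in the usage example.

```python
def logistic_orbit(lam, x0, n):
    """Return [x_0, x_1, ..., x_n] for x_k = lam * x_{k-1} * (1 - x_{k-1})."""
    xs = [x0]
    for _ in range(n):
        xs.append(lam * xs[-1] * (1.0 - xs[-1]))
    return xs


def count_limit_states(lam, x0=0.1, burn_in=5000, sample=64, digits=6):
    """Count distinct (rounded) values visited after discarding a transient."""
    x = x0
    for _ in range(burn_in):
        x = lam * x * (1.0 - x)
    seen = set()
    for _ in range(sample):
        x = lam * x * (1.0 - x)
        seen.add(round(x, digits))
    return len(seen)


if __name__ == "__main__":
    for lam in (0.9, 2.0, 3.2, 3.5, 3.55):
        print(lam, count_limit_states(lam))
```

In the chaotic regime the count simply equals the sample size, since the orbit no longer settles on a finite cycle.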

It is clear from this brief description already that the values $(\lambda_k)$ of $\lambda$ at which the system ‘branches’ (‘bifurcates’) draw closer together in the process (Fig. 24).

FIGURE 23d. Case $\lambda=3.5$

FIGURE 23e. Case $\lambda=4$

As conjectured by M. Feigenbaum and proved by O. Lanford [294], for all parabolic systems we have

$$\frac{\lambda_k-\lambda_{k-1}}{\lambda_{k+1}-\lambda_k}\to F,\qquad k\to\infty,$$

where $F=4.669201\ldots$ is a universal constant (the Feigenbaum constant).
The value $\lambda=4$ is of particular importance for (3): it is for this value of the parameter that the sequence of observations $(x_n)$ of our (chaotic) system is similar to a realization of a stochastic sequence of ‘white noise’ type.

FIGURE 24. Doubling process for the states $x_\infty$ in the logistic system as $\lambda\uparrow4$

Indeed, let $x_0=0.1$. We now calculate recursively the values of $x_1,x_2,\dots,x_{1000}$ using (3). The (empirical) mean value and the standard deviation evaluated on the basis of these 1000 numbers are 0.48887 and 0.35742, respectively (up to the fifth digit).


TABLE 2.

  k    ρ̂(k)   |   k    ρ̂(k)   |   k    ρ̂(k)   |   k    ρ̂(k)
  1   -0.033  |  11   -0.046  |  21   -0.008  |  31    0.038
  2   -0.058  |  12    0.002  |  22    0.009  |  32   -0.017
  3   -0.025  |  13   -0.011  |  23   -0.039  |  33    0.014
  4   -0.035  |  14    0.040  |  24   -0.020  |  34    0.001
  5   -0.012  |  15    0.014  |  25   -0.008  |  35    0.017
  6   -0.032  |  16   -0.023  |  26    0.017  |  36   -0.052
  7   -0.048  |  17   -0.030  |  27    0.006  |  37    0.004
  8    0.027  |  18    0.037  |  28   -0.004  |  38    0.053
  9   -0.020  |  19    0.078  |  29   -0.019  |  39   -0.021
 10   -0.013  |  20    0.017  |  30   -0.076  |  40    0.007

In Table 2 we present the values of the (empirical) correlation function $\hat\rho(k)$ calculated from $x_0,x_1,\dots,x_{1000}$. It is clearly visible from this table that the values $x_n$ of the logistic map with $\lambda=4$ can in practice be assumed to be uncorrelated. In this sense, the sequence $(x_n)$ can be called ‘chaotic white noise’.
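The experiment with $x_0=0.1$ and $\lambda=4$ can be repeated with a short script. The sketch below (function names are illustrative) computes the empirical mean, standard deviation, and correlation function of $x_0,\dots,x_{1000}$. Because the map is chaotic, rounding errors in floating-point arithmetic are amplified along the orbit, so the individual values, and hence the last digits of the statistics, need not coincide with those quoted in the text; the qualitative picture (mean near 0.5, standard deviation near 0.35, small correlations) is what one should expect to see.

```python
import math


def logistic_orbit4(x0, n):
    """Return [x_0, ..., x_n] for the logistic map with lambda = 4."""
    xs = [x0]
    for _ in range(n):
        xs.append(4.0 * xs[-1] * (1.0 - xs[-1]))
    return xs


def sample_mean_std(xs):
    """Empirical mean and (population) standard deviation."""
    mean = sum(xs) / len(xs)
    var = sum((x - mean) ** 2 for x in xs) / len(xs)
    return mean, math.sqrt(var)


def sample_autocorr(xs, k):
    """Empirical correlation function at lag k."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    cov = sum((xs[i] - mean) * (xs[i + k] - mean) for i in range(n - k)) / n
    return cov / var


if __name__ == "__main__":
    xs = logistic_orbit4(0.1, 1000)
    print(sample_mean_std(xs))
    print([round(sample_autocorr(xs, k), 3) for k in range(1, 11)])
```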

It is worth noting that the system $x_n=4x_{n-1}(1-x_{n-1})$, $n\ge1$, with $x_0\in(0,1)$ has an invariant distribution $P$ (which means that $P$ satisfies the equality $P(T^{-1}A)=P(A)$ for each Borel subset $A$ of $(0,1)$) with density

$$p(x)=\frac{1}{\pi\sqrt{x(1-x)}},\qquad x\in(0,1). \tag{4}$$

Thus, assuming that the initial value $x_0$ is a random variable with density $p=p(x)$ of the probability distribution, we can see that the random variables $x_n$, $n\ge1$, have the same distribution as $x_0$. We point out that all the ‘randomness’ of the resulting stochastic dynamic system $(x_n)$ is related to the random initial value $x_0$, while the dynamics of the transitions $x_n\to x_{n+1}$ is deterministic and described by (3).

If (4) holds, then it is easy to see that $Ex_0=\frac12$, $Ex_0^2=\frac38$, and $Dx_0=\frac18=(0.35355\ldots)^2$. (Cf. the values 0.48887 and 0.35742 presented above.) As regards the correlation function

$$\rho(k)=\frac{Ex_0x_k-Ex_0Ex_k}{\sqrt{Dx_0\,Dx_k}},$$

we have

$$\rho(k)=\begin{cases}1&\text{if }k=0,\\0&\text{if }k\ne0.\end{cases}$$

EXAMPLE 2 (the Bernoulli transformation). We set

$$x_n=2x_{n-1}\pmod 1,\qquad x_0\in(0,1).$$

Here the uniform distribution with density $p(x)=1$, $x\in(0,1)$, is invariant, and we have $Ex_0=\frac12$, $Ex_0^2=\frac13$, $Dx_0=\frac1{12}$, and $\rho(k)=2^{-k}$, $k=1,2,\dots$.

EXAMPLE 3 (the ‘tent’ map). We set

$$x_n=1-|1-2x_{n-1}|,\qquad x_0\in(0,1).$$

Here, as in Example 2, the uniform distribution on $(0,1)$ is invariant. Hence $Ex_0=\frac12$, $Ex_0^2=\frac13$, $Dx_0=\frac1{12}$, and $\rho(k)=0$ for $k\ne0$.

EXAMPLE 4. Let

$$x_n=1-2\sqrt{|x_{n-1}|},\qquad x_0\in(-1,1).$$

Then the distribution with density $p(x)=(1-x)/2$ on $(-1,1)$ is invariant. For this distribution we have $Ex_0=-\frac13$, $Ex_0^2=\frac13$, and $Dx_0=\frac29$.

The behavior of the sequences $(x_n)_{n\le N}$ for $x_0=0.2$ and $N=100$ or $N=1000$ is depicted in Fig. 25a,b.
FIGURE 25a. Graph of the sequence $x=(x_n)_{n\ge0}$ with $x_n=1-2\sqrt{|x_{n-1}|}$ and $x_0=0.2$ for $N=100$

FIGURE 25b. Graph of the sequence $x=(x_n)_{n\ge0}$ with $x_n=1-2\sqrt{|x_{n-1}|}$ and $x_0=0.2$ for $N=1000$


3. The above examples of nonlinear dynamical systems are of interest from various viewpoints. First, considering the example of, say, the logistic system, which develops in accordance with a ‘binary’ pattern, one can get a clear idea of fractality discussed in Chapter III, § 2. Second, the behavior of such ‘chaotic’ systems suggests using them in the construction of models simulating the evolution of financial indexes, in particular, in times of crashes, which are characterized by ‘chaotic’ (rather than ‘stochastic’) behavior.
§ 4b. Distinguishing between ‘Chaotic’ and ‘Stochastic’ Sequences


1. The fact that purely deterministic dynamical systems can have properties of ‘stochastic white noise’ is not unexpected. It has been fairly long known, although this still turns out to be surprising for many. The more interesting, for that reason, are the questions on how one can draw a line between ‘stochastic’ and ‘chaotic’ sequences, whether it is possible in principle, and whether the true nature of ‘irregularities’ in the financial data is ‘stochastic’ or ‘chaotic’. (Presumably, an appropriate approach here could be based on the concept of ‘complexity in the sense of A. N. Kolmogorov, P. Martin-Löf, and V. A. Uspenskii’ adapted for particular realizations.)

Below we discuss the approach taken in [305] and [448]. In it, the central role in distinguishing between ‘chaotic’ and ‘stochastic’ is assigned to the function

$$C(\varepsilon)=\lim_{N\to\infty}\frac{\psi(N,\varepsilon)}{N^2}, \tag{1}$$

where, given a sequence $(x_n)$, the function $\psi(N,\varepsilon)$ is the number of pairs $(i,j)$, $i,j\le N$, such that

$$|x_i-x_j|<\varepsilon.$$

Besides $C(\varepsilon)$, we shall also consider the functions

$$C_m(\varepsilon)=\lim_{N\to\infty}\frac{\psi_m(N,\varepsilon)}{N^2},$$

where $\psi_m(N,\varepsilon)$ is the number of pairs $(i,j)$, $i,j\le N$, such that all the differences between the corresponding components of the vectors $(x_i,x_{i+1},\dots,x_{i+m-1})$ and $(x_j,x_{j+1},\dots,x_{j+m-1})$ are at most $\varepsilon$. (For $m=1$ we have $\psi_1(N,\varepsilon)=\psi(N,\varepsilon)$.)

For stochastic sequences $(x_n)$ of ‘white noise’ type we have

$$C_m(\varepsilon)\sim\varepsilon^{\nu_m} \tag{2}$$

for small $\varepsilon$, where the fractal exponent $\nu_m$ is equal to $m$. Many deterministic systems also have property (2) (e.g., the logistic system (3) in the preceding section [305]). The exponent $\nu_m$ is also called the correlation dimension and is closely connected with the Hausdorff dimension and Kolmogorov’s information dimension.

Distinguishing between ‘chaotic’ and ‘stochastic’ sequences in [305] and [448] 
is based on the following observation: these sequences have different correlation 
dimensions. As seen from what follows, this dimension is larger for ‘stochastic’ 
sequences than for ‘chaotic’ ones. 
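A finite-sample version of the correlation integral and of the local slope of $\ln C_m(\varepsilon)$ against $\ln\varepsilon$ can be sketched as follows. This is an illustration under assumptions made here: the limit $N\to\infty$ is replaced by a fixed sample, the count runs over unordered pairs $i<j$ of distinct $m$-vectors (a common finite-sample variant of $\psi_m/N^2$), the strict inequality $<\varepsilon$ is used, and the scales are taken as $\varepsilon_j=\varphi^j$ with $\varphi=0.9$. The function names are hypothetical.

```python
import math
from itertools import combinations


def correlation_integral(xs, m, eps):
    """Fraction of pairs of m-vectors (x_i, ..., x_{i+m-1}) closer than eps in every component."""
    vectors = [xs[i:i + m] for i in range(len(xs) - m + 1)]
    pairs = 0
    close = 0
    for u, v in combinations(vectors, 2):
        pairs += 1
        if max(abs(a - b) for a, b in zip(u, v)) < eps:
            close += 1
    return close / pairs


def local_dimension(xs, m, j, phi=0.9):
    """Slope of ln C_m between the scales phi**j and phi**(j + 1).

    Raises ValueError (from math.log) if no pair is close at one of the scales.
    """
    e1, e2 = phi ** j, phi ** (j + 1)
    c1 = correlation_integral(xs, m, e1)
    c2 = correlation_integral(xs, m, e2)
    return (math.log(c1) - math.log(c2)) / (math.log(e1) - math.log(e2))


if __name__ == "__main__":
    # Evenly spaced points on [0, 1): a one-dimensional set, slope of order 1
    grid = [i / 200 for i in range(200)]
    print(local_dimension(grid, 1, 20))
```

The pairwise loop costs on the order of $N^2m$ operations, so for series with thousands of observations a vectorized or tree-based neighbor count would be preferable.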
By [448] and [305], the quantities

$$\hat\nu_{m,j}=\frac{\ln C_m(\varepsilon_j)-\ln C_m(\varepsilon_{j+1})}{\ln\varepsilon_j-\ln\varepsilon_{j+1}}$$

and

$$\hat\nu_m=\frac1k\sum_{j=1}^{k}\hat\nu_{m,j},$$

where $\varepsilon_j=\varphi^j$ and $0<\varphi<1$, can be taken as estimators of the correlation dimension $\nu_m$.

In Table 3a one can find the values of the $\hat\nu_{m,j}$ corresponding to $\varphi=0.9$, $m=1,2,3,4,5,10$, and several values of $j$ in the case of the logistic sequence $(x_n)_{n\le N}$, where $N=5900$ and $\varepsilon_j=\varphi^j$ (these data are borrowed from [305]).

TABLE 3a. Values of $\hat\nu_{m,j}$ for the logistic system


51059 [0.95] 0. 
0350.90) 094[o9T| 098 1.01 
0 [0.01 [097 099). 


We now compare the data in this table with the estimates for $\hat\nu_{m,j}$ obtained by
a simulation of Gaussian white noise with parameters characteristic of the logistic
map (3) in § 6a (Table 3b; the data are from [305]):


TABLE 3b. Values of $\hat\nu_{m,j}$ for Gaussian white noise


 j  | m = 1 | m = 2 | m = 3 | m = 4 | m = 5 | m = 10
 20 | 0.84  | 1.68  | 2.52  | 3.35  | 4.20  | 8.43
 30 | 0.98  | 1.97  | 2.95  | 3.98  | 4.98  | -
 35 | 0.99  | 1.97  | 2.93  | 4.00  | 5.53  | -
 40 | 1.00  | 2.02  | 3.03  | 4.15  | 5.38  | -


Comparing these tables we see that it is fairly difficult to distinguish between
‘chaotic’ and ‘stochastic’ cases on the basis of the correlation dimension $\hat\nu_{1,j}$
(corresponding to $m = 1$). However, if $m$ is larger, then a considerable difference between
the values of the $\hat\nu_{m,j}$ for these two cases is apparent. This is a rather solid
argument in favor of the conjecture on distinct natures of the corresponding sequences
$(x_n)$, although there is virtually no difference between their empirical mean values,
variances, or correlations.


2. To illustrate the problems of distinguishing between ‘stochastic’ and ‘chaotic’
cases for financial series we present the tables of the correlation dimensions
calculated for the daily values of the ‘returns’ $h_n = \ln \frac{S_n}{S_{n-1}}$, $n \ge 1$,
corresponding to the IBM stock price and the S&P500 Index (Tables 4a and 4b are
compiled from 5903 observations carried out between June 2, 1962 and December 31,
1985; the data are borrowed from [305]):


TABLE 4a. Values of $\hat\nu_{m,j}$ for IBM stock




2.05 | 3.63 


(1.76 |2.61 


1.93 | 2.88 | 3.82 


0.93 | 1.82 | 2.07 | 3.49 
 35 | 0.98 | 1.94 | 2.88 | 3.79 | 4.75 | 11.00
 40 | 0.99 | 1.98 | 2.92 | 3.84 | 4.81 | -


Comparing these tables we see, first, that the estimates of the ‘correlation
dimension’ of the IBM and the S&P500 returns in these two tables are very close.
Second, comparing the data in Tables 4a,b and Tables 3a,b we see that the
sequences $(h_n)$ corresponding to these two indexes (where $h_n = \ln \frac{S_n}{S_{n-1}}$)
are closer to ‘stochastic white noise’. Of course, this cannot disprove the
conjecture that other ‘chaotic sequences’, of larger correlation dimensions, can also have
similar properties. (For greater detail on the issue of distinguishing and for an
economist’s commentary, see [305].)

3. We now consider briefly another approach to discovering distinctions between
‘chaotic’ and ‘stochastic’ cases, which was suggested in [17].

Let $x = (x_n)$ be a ‘chaotic’ sequence that is a realization of some dynamical
system in which $x_0$ is a random variable with probability distribution $F = F(x)$
invariant with respect to this system.

Next, let $\tilde x = (\tilde x_n)$ be a ‘stochastic’ sequence of independent identically
distributed variables with (one-dimensional) distribution $F = F(x)$.

We consider now the variables

$$M_n = \max(x_0, x_1, \dots, x_n) \quad\text{and}\quad \tilde M_n = \max(\tilde x_0, \tilde x_1, \dots, \tilde x_n).$$

Let $F_n(x) = P(M_n \le x)$ and let $\tilde F_n(x) = P(\tilde M_n \le x)$.

The approach in [17] is based on the observation that the maximum is a char- 
acteristic well suited for capturing the difference between ‘stochastic’ and ‘chaotic’ 
sequences. 

To substantiate this approach, the authors of [17] proceed as follows. 

In the theory of limit theorems for extremum values there are well-known
necessary and sufficient conditions ensuring that the variables $a_n(\tilde M_n - b_n)$, where
$a_n > 0$ and $b_n$ are some constants and $n \ge 1$, have a (nontrivial) limit distribution

$$\lim_{n\to\infty} P\big(a_n(\tilde M_n - b_n) \le x\big) = \tilde G(x).$$

Referring to [187], [206], [156], [124], and [137] for details, we now present several
examples.

If $F(x) = 1 - x^{-\rho}$, $x \ge 1$, and $\rho > 0$, then (for $x > 0$)

$$P\big(n^{-1/\rho} \tilde M_n \le x\big) \to \exp(-x^{-\rho}).$$

If $F(x) = 1 - (-x)^{\rho}$, $-1 \le x \le 0$, and $\rho > 0$, then (for $x < 0$)

$$P\big(n^{1/\rho} \tilde M_n \le x\big) \to \exp(-|x|^{\rho}).$$

If $F(x) = 1 - e^{-x}$, $x \ge 0$, then

$$P\big(\tilde M_n - \log n \le x\big) \to \exp(-e^{-x})$$

for $x \in \mathbb{R}$.

If $F(x) = \Phi(x)$ is a standard normal distribution, then

$$P\big((2\log n)^{1/2} (\tilde M_n - b_n) \le x\big) \to \exp(-e^{-x}), \qquad x \in \mathbb{R},$$

where we choose the $b_n$ such that $P(\tilde x_0 > b_n) = \frac{1}{n}$. (In this case $b_n \sim (2\log n)^{1/2}$.)

For the Bernoulli transformation (Example 2 in § 4a) we have the invariant
distribution $F(x) = x$, $x \in (0,1)$. Setting $a_n = n$ and $b_n = 1 - n^{-1}$ we obtain the
limit distribution

$$\tilde G(x) = \exp(x - 1), \qquad x \le 1.$$

In Example 4 from § 4a we have $F(x) = 1 - \rho^2(x)$, where $\rho(x) = (1 - x)/2$,
$x \in (-1,1)$, therefore setting $a_n = \sqrt{n}$ and $b_n = 1 - 2/\sqrt{n}$ we see that

$$\tilde G(x) = \exp\Big(-\Big(\frac{x}{2} - 1\Big)^2\Big).$$

For Example 1 in § 4a the invariant distribution is $F(x) = \frac{2}{\pi} \arcsin\sqrt{x}$,
and after the corresponding renormalization we obtain

$$\tilde G(x) = \exp\big(-(1 - 2x)^{1/2}\big).$$
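For independent identically distributed variables the distribution of the maximum is known exactly, so the limit relations in these examples can be checked directly. The sketch below does this for three of them; it assumes that the maximum is taken over exactly $n$ variables, and all function names are illustrative.

```python
import math

def max_cdf_exponential(n, x):
    """P(M - log n <= x) for the maximum M of n i.i.d. variables with F(y) = 1 - exp(-y)."""
    y = x + math.log(n)
    return (1.0 - math.exp(-y)) ** n if y > 0 else 0.0

def max_cdf_uniform(n, x):
    """P(n (M - (1 - 1/n)) <= x) for the maximum M of n i.i.d. variables with F(y) = y on (0, 1)."""
    y = 1.0 - 1.0 / n + x / n
    return min(max(y, 0.0), 1.0) ** n

def max_cdf_power_tail(n, x, rho):
    """P(n**(-1/rho) M <= x) for the maximum M of n i.i.d. variables with F(y) = 1 - y**(-rho), y >= 1."""
    y = n ** (1.0 / rho) * x
    return (1.0 - y ** (-rho)) ** n if y >= 1 else 0.0

def limit_exponential(x):
    return math.exp(-math.exp(-x))

def limit_uniform(x):
    return math.exp(x - 1.0) if x <= 1 else 1.0

def limit_power_tail(x, rho):
    return math.exp(-x ** (-rho)) if x > 0 else 0.0

n = 10 ** 4
print(max_cdf_exponential(n, 0.5), limit_exponential(0.5))
print(max_cdf_uniform(n, 0.0), limit_uniform(0.0))
print(max_cdf_power_tail(n, 1.5, 2.0), limit_power_tail(1.5, 2.0))
```

In each printed pair the finite-$n$ value and the limit should agree to several decimal places.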


Given the distributions $\tilde F_n(x) = (F(x))^n$ and the limit distribution $\tilde G(x)$, it
would be reasonable to compare them with the corresponding distributions $F_n(x)$
and, if possible, with their limits, say, $G(x)$. However, as pointed out in [17], there
exists a serious technical problem, because there are no analytic expressions for the
$F_n(x)$ in the examples in § 4a that are convenient for further analysis. For that
reason, the approach taken in [17] consists in the numerical analysis of the distributions
$F_n(x)$ for large $n$ and their comparison with the corresponding distributions $\tilde F_n(x)$.

For the dynamical systems in § 4a this analysis shows that, globally, the
behavior of the $F_n(x)$ (for chaotic systems with invariant distribution $F(x)$) has a
character distinct from the behavior of the $\tilde F_n(x)$ (for stochastic systems of
independent identically distributed variables with one-dimensional distribution $F(x)$).
This indicates that, for the models under consideration, the maximum value is a
‘good’ statistic for our problem of distinguishing between chaotic and stochastic
cases. But of course, this does not rule out the possibility that there exists a chaotic
system $x_{n+1} = f(x_n, x_{n-1}, \dots, x_{n-k}; \lambda)$ with $k$ sufficiently large that is difficult to
distinguish from stochastic white noise on the basis of a large (but finite) number
of observations.

1. Non-Gaussian Models of Distributions and Processes


§ 1a. Stable and Infinitely Divisible Distributions


1. In the next chapter we discuss the results of the statistical analysis of prospective
models of distribution and evolution for such financial indexes as currency exchange
rates, share prices, and so on. This analysis will demonstrate the importance of
stable distributions and stable processes as natural and fairly likely candidates in
the construction of probabilistic models of this kind.

For that reason, we provide the reader in this section with necessary information
on both these distributions and processes and on more general, infinitely divisible
ones: without the latter our description and the discussion of the properties of
financial indexes would be incomplete.

In our exposition of basic concepts and properties of stability and infinite
divisibility we shall (to a certain extent) follow the chronological order. First, we shall
discuss one-dimensional stable distributions considered in the 1920s by P. Lévy,
G. Pólya, and A. Ya. Khintchine, and then proceed to one- and several-dimensional
infinitely divisible distributions studied by B. de Finetti, A. N. Kolmogorov, P. Lévy,
and A. Ya. Khintchine in the 1930s. After that, in § 1b we introduce the reader to
the main concepts and the properties of Lévy processes and stable processes.

The monographs [156], [188], [418], and [484] belong among widely used text- 
books on stable and infinitely divisible distributions and processes. 


2. DEFINITION 1. We say that a nondegenerate random variable $X$ is stable or
has a stable distribution if for any positive numbers $a$ and $b$ there exist a positive
$c$ and a number $d$ such that

$$\mathrm{Law}(aX_1 + bX_2) = \mathrm{Law}(cX + d) \tag{1}$$

for independent random copies $X_1$ and $X_2$ of $X$ (this means that $\mathrm{Law}(X_i) =
\mathrm{Law}(X)$, $i = 1,2$; we shall assume without loss of generality that all the random
variables under consideration are defined on the same probability space $(\Omega, \mathcal{F}, P)$).




It can be proved (see the above-mentioned monographs) that in (1) we
necessarily have

$$c^{\alpha} = a^{\alpha} + b^{\alpha} \tag{2}$$

for some $\alpha \in (0, 2]$ independent of $a$ and $b$.
One often uses another, equivalent, definition. 


DEFINITION 2. A random variable $X$ is said to be stable if for each $n \ge 2$ there
exist a positive number $C_n$ and a number $D_n$ such that

$$\mathrm{Law}(X_1 + X_2 + \dots + X_n) = \mathrm{Law}(C_n X + D_n), \tag{3}$$

where $X_1, X_2, \dots, X_n$ are independent copies of $X$.
If $D_n = 0$ in (3) for $n \ge 2$, i.e.,

$$\mathrm{Law}(X_1 + X_2 + \dots + X_n) = \mathrm{Law}(C_n X), \tag{4}$$

then $X$ is said to be a strictly stable variable.


Remarkably,

$$C_n = n^{1/\alpha}$$

in (3) and (4) for some $\alpha$, $0 < \alpha \le 2$, which is, of course, the same parameter as
in (2).


To emphasize the role and importance of $\alpha$, one often uses the term ‘$\alpha$-stability’
in place of ‘stability’.

For completeness, it is reasonable to add a third definition to the above two. It 
brings forward the role of stable distributions as the only ones that can be the limit 
distributions for (suitably normalized and centered) sums of independent identically 
distributed random variables. 


DEFINITION 3. We say that a random variable $X$ has a stable distribution (or
simply, is stable) if the distribution of $X$ has a domain of attraction in the
following sense: there exist sequences of independent identically distributed random
variables $(Y_n)$, positive numbers $(d_n)$, and real numbers $(a_n)$ such that

$$\frac{Y_1 + \dots + Y_n}{d_n} + a_n \xrightarrow{d} X \quad\text{as } n \to \infty; \tag{5}$$

here ‘$\xrightarrow{d}$’ means convergence in distribution, i.e.,

$$\mathrm{Law}\Big(\frac{Y_1 + \dots + Y_n}{d_n} + a_n\Big) \to \mathrm{Law}(X)$$

in the sense of weak convergence of the corresponding measures.


This definition is equivalent to the above two because a random variable $X$
can be the limit in distribution of the variables $\frac{Y_1 + \dots + Y_n}{d_n} + a_n$ for some
sequence $(Y_n)$ of independent identically distributed random variables if and only if
$X$ is stable (in the sense of Definition 1 or Definition 2; see the proof in [188]
or [439; Chapter II, § 5]).


3. According to a remarkable result of probability theory (P. Lévy, A. Ya. Khintchine)
the characteristic function

$$\varphi(\theta) = E e^{i\theta X}$$

of a stable random variable $X$ has the following representation:

$$\varphi(\theta) = \begin{cases}
\exp\big\{ i\mu\theta - \sigma^{\alpha}|\theta|^{\alpha} \big(1 - i\beta(\operatorname{Sgn}\theta)\tan\frac{\pi\alpha}{2}\big) \big\} & \text{if } \alpha \ne 1, \\
\exp\big\{ i\mu\theta - \sigma|\theta| \big(1 + i\beta\frac{2}{\pi}(\operatorname{Sgn}\theta)\ln|\theta|\big) \big\} & \text{if } \alpha = 1,
\end{cases} \tag{6}$$

where $0 < \alpha \le 2$, $|\beta| \le 1$, $\sigma > 0$, and $\mu \in \mathbb{R}$.
Here the four parameters $(\alpha, \beta, \sigma, \mu)$ have the following meaning:


$\alpha$ (which is, of course, the same as in (2) and (4)) is the stability exponent or
the characteristic parameter;

$\beta$ is the skewness parameter of the distribution density;

$\sigma$ is the scale parameter;

$\mu$ is the location parameter.

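The parameter $\alpha$ ‘controls’ the decrease of the ‘tails’ of distributions. If
$0 < \alpha < 2$, then

$$\lim_{x\to\infty} x^{\alpha} P(X > x) = C_{\alpha}\,\frac{1+\beta}{2}\,\sigma^{\alpha}, \tag{7}$$

$$\lim_{x\to\infty} x^{\alpha} P(X < -x) = C_{\alpha}\,\frac{1-\beta}{2}\,\sigma^{\alpha}, \tag{8}$$

where

$$C_{\alpha} = \Big(\int_0^{\infty} x^{-\alpha} \sin x \, dx\Big)^{-1} = \begin{cases}
\dfrac{1-\alpha}{\Gamma(2-\alpha)\cos\frac{\pi\alpha}{2}}, & \alpha \ne 1, \\[2mm]
\dfrac{2}{\pi}, & \alpha = 1.
\end{cases} \tag{9}$$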
If $\alpha = 2$, then by (6) we obtain

$$\varphi(\theta) = e^{i\mu\theta - \sigma^2\theta^2} = e^{i\mu\theta - \frac{\theta^2}{2}(2\sigma^2)}, \tag{10}$$

which shows that $\varphi(\theta)$ is the characteristic function of the normal distribution
$\mathcal{N}(\mu, 2\sigma^2)$, i.e., of a normally distributed random variable $X$ such that

$$EX = \mu \quad\text{and}\quad DX = 2\sigma^2.$$

The value of $\beta$ in (6) is not well defined in this case (because if $\alpha = 2$, then this
parameter enters the term $\beta\tan\pi$, which is equal to zero). One usually sets $\beta = 0$.

Clearly, as regards the behavior of the tails of distributions, the case $\alpha = 2$
significantly differs from $\alpha < 2$. For instance, if $\mu = 0$ and $2\sigma^2 = 1$, then

$$P(|X| > x) \sim \sqrt{\frac{2}{\pi}}\,\frac{e^{-x^2/2}}{x} \quad\text{as } x \to \infty. \tag{11}$$

Comparing this with (7) and (8) we see that for $\alpha < 2$ the tails are ‘heavier’: the
decrease in the normal case is faster. (This is the right place to point out that, as
seen in statistical analysis, for many series of financial indexes $S = (S_n)_{n\ge0}$ the
distributions of the ‘returns’ $h_n = \ln\frac{S_n}{S_{n-1}}$ have ‘heavy tails’. It is natural, for this
reason, to try the class of stable distributions as a source of probabilistic models
governing the sequences $h = (h_n)$.)

It is important to note that, as seen from (7) and (8), the expectation $E|X|$ is
finite if and only if $\alpha > 1$. In general, $E|X|^p < \infty$ if and only if $p < \alpha$.

In connection with asymptotic formulas (7) and (8) the Pareto distribution is
worth recalling. Its distribution density is

$$f_{\alpha,b}(x) = \begin{cases}
\dfrac{\alpha b^{\alpha}}{x^{\alpha+1}}, & x \ge b, \\[2mm]
0, & x < b,
\end{cases} \tag{12}$$

with $\alpha > 0$ and $b > 0$, therefore its distribution function $F_{\alpha,b}(x)$ satisfies the relation

$$1 - F_{\alpha,b}(x) = \Big(\frac{b}{x}\Big)^{\alpha}, \qquad x \ge b. \tag{13}$$

Comparing with (7) and (8) we see that the behavior at infinity of stable
distributions is similar to that of the Pareto distribution. One can say that the tails of
stable distributions fall within the Pareto type.
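The tail asymptotics (7) can be checked numerically in the two non-Gaussian cases where the distribution function has a simple closed form. The sketch below is a hedged illustration: it uses the standard distribution functions of the symmetric Cauchy law ($\alpha = 1$, $\beta = 0$) and of the one-sided Lévy law ($\alpha = 1/2$, $\beta = 1$), both with scale $\sigma$ and location 0; the function names are hypothetical.

```python
import math

def tail_constant(alpha):
    """The constant C_alpha from (9)."""
    if alpha == 1:
        return 2 / math.pi
    return (1 - alpha) / (math.gamma(2 - alpha) * math.cos(math.pi * alpha / 2))

def cauchy_upper_tail(x, sigma):
    """P(X > x) for the symmetric Cauchy law with scale sigma and location 0."""
    return 0.5 - math.atan(x / sigma) / math.pi

def levy_upper_tail(x, sigma):
    """P(X > x), x > 0, for the one-sided stable law with alpha = 1/2, scale sigma, location 0."""
    return math.erf(math.sqrt(sigma / (2 * x)))

def pareto_upper_tail(x, alpha, b):
    """1 - F(x) from (13) for x >= b."""
    return (b / x) ** alpha

sigma = 1.5
x = 1e8
# Left-hand sides of (7) at a large x against the predicted limits.
print(x * cauchy_upper_tail(x, sigma), tail_constant(1) * (1 + 0) / 2 * sigma)
print(math.sqrt(x) * levy_upper_tail(x, sigma), tail_constant(0.5) * (1 + 1) / 2 * math.sqrt(sigma))
# For the Pareto law x**alpha * (1 - F(x)) is exactly b**alpha for every x >= b.
print(x ** 1.2 * pareto_upper_tail(x, 1.2, 2.0), 2.0 ** 1.2)
```

Each printed pair should agree to several significant digits, which illustrates why stable tails are said to be of Pareto type.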

The skewness parameter $\beta \in [-1,1]$ in (6) characterizes the asymmetry of a
distribution. If $\beta = 0$, then the distribution is symmetric. If $\beta > 0$, then the
distribution is skewed on the left, and the closer $\beta$ approaches one, the greater this
skewness. The case of $\beta < 0$ corresponds to a right skewness.

The parameter $\sigma$ plays the role of a scale coefficient. For a normal distribution
($\alpha = 2$) we have $DX = 2\sigma^2$ (note that the variance here is $2\sigma^2$, not $\sigma^2$ as in the
standard notation). On the other hand, if $\alpha < 2$, then $DX$ is not defined.

We call $\mu$ the location parameter, because for $\alpha > 1$ we have $\mu = EX$ (so that
$E|X| < \infty$). There is no such interpretation in the general case for the mere reason
that $EX$ does not necessarily exist.


4. Following an established tradition, we shall denote the stable distribution with
parameters $\alpha$, $\beta$, $\sigma$, and $\mu$ by

$$S_{\alpha}(\sigma, \beta, \mu);$$

we shall write $X \sim S_{\alpha}(\sigma, \beta, \mu)$ to indicate that $X$ has a stable distribution with
parameters $\alpha$, $\beta$, $\sigma$, and $\mu$.

Note that $S_{\alpha}(\sigma, \beta, \mu)$ is a symmetric distribution if and only if $\beta = \mu = 0$. (It is
obvious from the formula for the characteristic function that the constant $D_n$ in (3)
is equal to zero in that case.) This distribution is symmetric relative to $\mu$ (which
can be arbitrary) if and only if $\beta = 0$.

In the symmetric case ($\beta = \mu = 0$) one often uses the notation

$$X \sim S\alpha S.$$

In this case the characteristic function is

$$\varphi(\theta) = e^{-\sigma^{\alpha}|\theta|^{\alpha}}. \tag{14}$$


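5. Unfortunately, explicit formulas for the densities of stable distributions are
known only for some values of the parameters. Among these distributions are

the normal distribution $S_2(\sigma, 0, \mu) = \mathcal{N}(\mu, 2\sigma^2)$ with density

$$\frac{1}{2\sigma\sqrt{\pi}}\, e^{-\frac{(x-\mu)^2}{4\sigma^2}}; \tag{15}$$

the Cauchy distribution $S_1(\sigma, 0, \mu)$ with density

$$\frac{\sigma}{\pi\big((x-\mu)^2 + \sigma^2\big)}; \tag{16}$$

the one-sided stable distribution (also called the Lévy or Smirnov distribution)
$S_{1/2}(\sigma, 1, \mu)$ with exponent $\alpha = 1/2$ on $(\mu, \infty)$ with density

$$\Big(\frac{\sigma}{2\pi}\Big)^{1/2} \frac{1}{(x-\mu)^{3/2}} \exp\Big(-\frac{\sigma}{2(x-\mu)}\Big). \tag{17}$$

We point out two interesting and useful particular cases of (16) and (17):
if $X \sim S_1(\sigma, 0, 0)$, then

$$P(X \le x) = \frac{1}{2} + \frac{1}{\pi}\arctan\frac{x}{\sigma} \tag{18}$$

for $x > 0$;
if $X \sim S_{1/2}(\sigma, 1, 0)$, then

$$P(X \le x) = 2\Big(1 - \Phi\Big(\sqrt{\frac{\sigma}{x}}\Big)\Big) \tag{19}$$

for $x > 0$.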
As regards the representation of the densities of stable distributions by series,
see [156], [225], [418], and [484].

6. We now assume that $X_1, X_2, \dots, X_n$ are independent random variables such
that

$$X_i \sim S_{\alpha}(\sigma_i, \beta_i, \mu_i), \qquad i = 1, 2, \dots, n.$$

Although, in general, they are not identically distributed, the fact that they all
have the same stability exponent $\alpha$ indicates (see formula (6) for the characteristic
function) that their sum $X = X_1 + \dots + X_n$ has a distribution of the same type,
$S_{\alpha}(\sigma, \beta, \mu)$, with parameters

$$\sigma = (\sigma_1^{\alpha} + \dots + \sigma_n^{\alpha})^{1/\alpha}, \qquad
\beta = \frac{\beta_1\sigma_1^{\alpha} + \dots + \beta_n\sigma_n^{\alpha}}{\sigma_1^{\alpha} + \dots + \sigma_n^{\alpha}}, \qquad
\mu = \mu_1 + \dots + \mu_n.$$
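The rule for combining the parameters of independent stable summands with a common exponent follows by multiplying the characteristic functions (6). The sketch below checks this numerically; the helper names are illustrative, and the parameter formulas are the standard ones that result from adding the exponents in (6).

```python
import cmath
import math

def stable_cf(theta, alpha, beta, sigma, mu):
    """Characteristic function (6) of S_alpha(sigma, beta, mu)."""
    if theta == 0:
        return 1.0 + 0.0j
    sgn = 1.0 if theta > 0 else -1.0
    if alpha != 1:
        w = 1 - 1j * beta * sgn * math.tan(math.pi * alpha / 2)
        return cmath.exp(1j * mu * theta - (sigma * abs(theta)) ** alpha * w)
    w = 1 + 1j * beta * (2 / math.pi) * sgn * math.log(abs(theta))
    return cmath.exp(1j * mu * theta - sigma * abs(theta) * w)

def sum_parameters(alpha, params):
    """Parameters (sigma, beta, mu) of X_1 + ... + X_n; params is a list of (sigma_i, beta_i, mu_i)."""
    s = sum(sig ** alpha for sig, _, _ in params)
    sigma = s ** (1 / alpha)
    beta = sum(b * sig ** alpha for sig, b, _ in params) / s
    mu = sum(m for _, _, m in params)
    return sigma, beta, mu

def product_cf(theta, alpha, params):
    """Characteristic function of the sum of independent summands, as a product."""
    result = 1.0 + 0.0j
    for sig, b, m in params:
        result *= stable_cf(theta, alpha, b, sig, m)
    return result

params = [(1.0, 0.5, -0.3), (0.7, -1.0, 1.2), (2.0, 0.2, 0.0)]
for alpha in (1.5, 1.0, 0.5):
    sigma, beta, mu = sum_parameters(alpha, params)
    print(alpha, (sigma, beta, mu),
          abs(product_cf(0.8, alpha, params) - stable_cf(0.8, alpha, beta, sigma, mu)))
```

The last printed number in each line is the discrepancy between the two characteristic functions and should be at the level of rounding error.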


7. We now proceed to the more general class of so-called ‘infinitely divisible’
distributions, which includes stable distributions.


DEFINITION 4. We say that a random variable $X$ and its probability distribution
are infinitely divisible if for each $n \ge 1$ there exist independent identically distributed
random variables $X_{n1}, \dots, X_{nn}$ such that $X \stackrel{d}{=} X_{n1} + \dots + X_{nn}$.


The practical importance of this class is based on the following property: these
and only these distributions can be the limits of the distributions of the sums
$\sum_{k=1}^{n} X_{nk}$ of variables in the scheme of series

$$\begin{array}{l}
X_{11} \\
X_{21}, X_{22} \\
\dots \\
X_{n1}, X_{n2}, \dots, X_{nn} \\
\dots
\end{array} \tag{20}$$

every row of which consists of independent identically distributed random variables
$X_{n1}, X_{n2}, \dots, X_{nn}$. Note that no connection between the variables in distinct rows
of (20) is assumed. (For greater detail, see [188] or [439; Chapter III, § 5].)

A more restricted class of stable distributions corresponds to the case when the
variables $X_{nk}$ in (20) can be constructed by means of a special procedure from
the variables in a fixed sequence of independent identically distributed variables
$Y_1, Y_2, \dots$ (see the end of subsection 2):

$$X_{nk} = \frac{Y_k}{d_n} + \frac{a_n}{n}, \qquad 1 \le k \le n, \quad n \ge 1. \tag{21}$$

(We point out that the class of infinitely divisible distributions includes also
the hyperbolic and the Gaussian\\inverse Gaussian distributions discussed below,
in § 1d.)

Definition 4 relates to the scalar case ($X \in \mathbb{R}$). It can be immediately extended
to the vector-valued case ($X \in \mathbb{R}^d$) with no significant modifications.

Let $P = P(dx)$ be the probability distribution of an infinitely divisible random
vector $X \in \mathbb{R}^d$ and let

$$\varphi(\theta) = E e^{i(\theta, X)} = \int_{\mathbb{R}^d} e^{i(\theta, x)}\, P(dx)$$

be its characteristic function; here $(\theta, x)$ is the scalar product of two vectors
$\theta = (\theta_1, \dots, \theta_d)$ and $x = (x_1, \dots, x_d)$.

The work of B. de Finetti, followed by A. N. Kolmogorov (for $E|X|^2 < \infty$)
and finally P. Lévy and A. Ya. Khintchine in the 1930s resulted in the following
Lévy-Khintchine formula for the characteristic function of $X \in \mathbb{R}^d$:

$$\varphi(\theta) = \exp\Big\{ i(\theta, B) - \frac{1}{2}(\theta, C\theta) + \int_{\mathbb{R}^d} \big(e^{i(\theta, x)} - 1 - i(\theta, x) I(|x| \le 1)\big)\, \nu(dx) \Big\}, \tag{22}$$

where $B \in \mathbb{R}^d$, $C = C(d \times d)$ is a symmetric nonnegative definite matrix, and
$\nu = \nu(dx)$ is a Lévy measure: a positive measure in $\mathbb{R}^d$ such that $\nu(\{0\}) = 0$ and

$$\int_{\mathbb{R}^d} (|x|^2 \wedge 1)\, \nu(dx) < \infty. \tag{23}$$

(Note that both cases $\nu(\mathbb{R}^d) < \infty$ and $\nu(\mathbb{R}^d) = \infty$ are possible here.)
It should be pointed out that $\varphi(\theta)$ is specified by the three characteristics $B$, $C$,
and $\nu$, and the triplet $(B, C, \nu)$ in (22) is unambiguously defined.


EXAMPLES.
1. If $X$ is a degenerate random variable ($P(X = a) = 1$), then $B = a$, $C = 0$,
$\nu = 0$, and

$$\varphi(\theta) = e^{i\theta a}.$$

2. If $X$ is a random variable having the Poisson distribution with parameter $\lambda$,
then $\nu(dx) = \lambda I_{\{1\}}(dx)$ is a measure concentrated at the point $x = 1$, $B = \lambda$, and

$$\varphi(\theta) = e^{\lambda(e^{i\theta} - 1)}.$$

3. If $X \sim \mathcal{N}(m, \sigma^2)$, then $B = m$, $C = \sigma^2$, $\nu = 0$, and

$$\varphi(\theta) = e^{im\theta - \frac{\sigma^2\theta^2}{2}}.$$

4. If $X$ is a random variable having the Cauchy distribution with density (16),
then

$$\varphi(\theta) = e^{i\mu\theta - \sigma|\theta|}.$$

5. If $X$ is a random variable with density (17) (a one-sided stable distribution
with exponent $\alpha = 1/2$), then

$$\varphi(\theta) = e^{i\mu\theta - \sigma^{1/2}|\theta|^{1/2}(1 - i\operatorname{Sgn}\theta)}.$$
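The first three examples can be verified mechanically from the Lévy-Khintchine formula (22) in dimension one. The sketch below is an illustration under a simplifying assumption: the Lévy measure is restricted to finitely many atoms, given as a list of (point, weight) pairs; the function names are hypothetical.

```python
import cmath
import math

def levy_khintchine_cf(theta, B, C, atoms=()):
    """Formula (22) for d = 1 with nu = sum of weight * (unit mass at point) over atoms."""
    jump = sum(
        w * (cmath.exp(1j * theta * x) - 1 - 1j * theta * x * (abs(x) <= 1))
        for x, w in atoms
    )
    return cmath.exp(1j * theta * B - 0.5 * C * theta ** 2 + jump)

def poisson_cf_from_pmf(theta, lam, kmax=200):
    """E exp(i theta X) for a Poisson variable, summed directly over its probabilities."""
    term = math.exp(-lam)  # P(X = 0)
    total = 0j
    for k in range(kmax + 1):
        total += cmath.exp(1j * theta * k) * term
        term *= lam / (k + 1)  # P(X = k + 1) from P(X = k)
    return total

theta, lam = 1.3, 2.5
# Example 2: B = lam, C = 0, nu = lam * (unit mass at 1).
print(levy_khintchine_cf(theta, lam, 0.0, [(1.0, lam)]), poisson_cf_from_pmf(theta, lam))
# Example 3: B = m, C = sigma^2, nu = 0.
print(levy_khintchine_cf(theta, 1.0, 4.0), cmath.exp(1j * theta * 1.0 - 0.5 * 4.0 * theta ** 2))
# Example 1: degenerate variable, B = a, C = 0, nu = 0.
print(levy_khintchine_cf(theta, 0.7, 0.0), cmath.exp(1j * theta * 0.7))
```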


8. A representation of the characteristic function $\varphi(\theta)$ by formula (22) (using the
traditional, ‘canonical’ truncation function $h(x) = x I(|x| \le 1)$) is not unique. For
instance, in place of $I(|x| \le 1)$, we could use a representation with $I(|x| \le a)$,
where $a > 0$. Of course, the corresponding triplet of characteristics would also
be different. Notably, $C$ and $\nu$ do not change in that case; these are ‘intrinsic’
characteristics independent of our choice of the truncation function. Only the first
characteristic, $B$, actually changes.
For precise statements, we shall need the following definition.


DEFINITION 5. We call a bounded function $h = h(x)$, $x \in \mathbb{R}^d$, with compact
support satisfying the equality $h(x) = x$ in a neighborhood of the origin a truncation
function.


Besides (22), the characteristic function $\varphi(\theta)$ has the following representations
(valid for an arbitrary truncation function $h = h(x)$):

$$\varphi(\theta) = \exp\Big\{ i(\theta, B(h)) - \frac{1}{2}(\theta, C\theta) + \int_{\mathbb{R}^d} \big(e^{i(\theta, x)} - 1 - i(\theta, h(x))\big)\, \nu(dx) \Big\}, \tag{24}$$

where $C$ and $\nu$ are independent of $h$ and are the same as in (22), while the value
of $B(h)$ varies with $h$ in accordance with the following equality:

$$B(h) - B(h') = \int_{\mathbb{R}^d} \big(h(x) - h'(x)\big)\, \nu(dx). \tag{25}$$

We point out that the integrals on the right-hand sides of (22) and (24) are well
defined in view of (23), because the function

$$e^{i(\theta, x)} - 1 - i(\theta, h(x))$$

is bounded and it is $O(|x|^2)$ as $|x| \to 0$.
If we replace (23) by the stronger inequality

$$\int_{\mathbb{R}^d} (|x| \wedge 1)\, \nu(dx) < \infty, \tag{26}$$

then we can set $h(x) = 0$ in (24), so that

$$\varphi(\theta) = \exp\Big\{ i(\theta, B(0)) - \frac{1}{2}(\theta, C\theta) + \int_{\mathbb{R}^d} \big(e^{i(\theta, x)} - 1\big)\, \nu(dx) \Big\}. \tag{27}$$

The constant $B(0)$ in this representation is called the drift component of the
random variable $X$.
On the other hand, if we replace (23) by the stronger inequality

$$\int_{\mathbb{R}^d} (|x|^2 \wedge |x|)\, \nu(dx) < \infty, \tag{28}$$

then (24) holds for $h(x) = x$, i.e.,

$$\varphi(\theta) = \exp\Big\{ i(\theta, \tilde B) - \frac{1}{2}(\theta, C\theta) + \int_{\mathbb{R}^d} \big(e^{i(\theta, x)} - 1 - i(\theta, x)\big)\, \nu(dx) \Big\}. \tag{29}$$

In this case the parameter $\tilde B$ (the center) is in fact the expectation: $\tilde B = EX$.
We note that the condition $E|X| < \infty$ is equivalent to the inequality

$$\int_{|x|>1} |x|\, \nu(dx) < \infty.$$
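The dependence of the first characteristic on the truncation function can be illustrated for a one-dimensional Lévy measure with finitely many atoms, where conditions (26) and (28) hold trivially. The sketch below is only an illustration: the atoms, weights, and function names are assumptions chosen for the example, and the indicator functions are used as truncation functions in the same way as the ‘canonical’ choice in (22).

```python
import cmath

def cf_with_truncation(theta, B_h, C, atoms, h):
    """Formula (24) for d = 1; atoms is a list of (point, weight) pairs describing nu."""
    jump = sum(w * (cmath.exp(1j * theta * x) - 1 - 1j * theta * h(x)) for x, w in atoms)
    return cmath.exp(1j * theta * B_h - 0.5 * C * theta ** 2 + jump)

def shift_B(B_h, atoms, h, h_new):
    """B(h_new) obtained from B(h) by (25): B(h) - B(h_new) = integral of (h - h_new) d nu."""
    return B_h - sum(w * (h(x) - h_new(x)) for x, w in atoms)

def h_canonical(x):
    return x if abs(x) <= 1 else 0.0   # x I(|x| <= 1)

def h_wide(x):
    return x if abs(x) <= 3 else 0.0   # x I(|x| <= a) with a = 3

def h_zero(x):
    return 0.0                         # admissible under (26)

def h_identity(x):
    return x                           # admissible under (28)

atoms = [(-2.5, 0.4), (0.3, 3.0), (1.7, 0.8)]
B, C = 0.2, 0.5                        # B corresponds to the canonical truncation

drift = shift_B(B, atoms, h_canonical, h_zero)       # B(0) in (27)
center = shift_B(B, atoms, h_canonical, h_identity)  # the center in (29)

theta = 0.9
reference = cf_with_truncation(theta, B, C, atoms, h_canonical)
for h in (h_wide, h_zero, h_identity):
    B_new = shift_B(B, atoms, h_canonical, h)
    print(abs(cf_with_truncation(theta, B_new, C, atoms, h) - reference))

# The center equals EX, which is the derivative of the characteristic function at 0 divided by i.
eps = 1e-5
print(center, cf_with_truncation(eps, B, C, atoms, h_canonical).imag / eps)
```

The three printed discrepancies should be at rounding level, confirming that only $B$ changes with the truncation function, and the last line compares the center with a finite-difference estimate of $EX$.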


9. We have already mentioned that there are only three cases when stable
distributions are explicitly known. These are (see subsection 5):

normal distribution ($\alpha = 2$),
Cauchy distribution ($\alpha = 1$),
Lévy-Smirnov distribution ($\alpha = 1/2$).

The class of infinitely divisible distributions is much broader; it covers, in
addition, the following distributions (although this is not always easy to prove):

Poisson distribution,
gamma-distribution,
geometric distribution,
negative binomial distribution,
t-distribution (Student distribution),
F-distribution (Fisher distribution),
log-normal distribution,
logistic distribution,
Pareto distribution,
two-sided exponential (Laplace) distribution,
hyperbolic distribution,
Gaussian\\inverse Gaussian distribution ...


TABLE 1
(discrete distributions)

Distribution      | Probabilities $p_k$                                   | Parameters
Poisson           | $\frac{e^{-\lambda}\lambda^k}{k!}$, $k = 0, 1, \dots$                  | $\lambda > 0$
Geometric         | $p q^{k-1}$, $k = 1, 2, \dots$                           | $0 < p < 1$, $q = 1 - p$
Negative binomial | $C_{k-1}^{r-1} p^r q^{k-r}$, $k = r, r+1, \dots$           | $0 < p < 1$, $q = 1 - p$, $r = 1, 2, \dots$
Binomial          | $C_n^k p^k q^{n-k}$, $k = 0, 1, \dots, n$                 | $0 \le p \le 1$, $q = 1 - p$, $n = 1, 2, \dots$

However, many well-known distributions are not infinitely divisible. These
include: binomial and uniform distributions, each nondegenerate distribution with
finite support, distributions with densities $f(x) = Ce^{-|x|^{\alpha}}$, where $\alpha > 2$.

Some of these distributions are discrete, while others have distribution densities.
For a complete picture and the convenience of references we list these distributions
explicitly in Tables 1 and 2.


10. The notion of ‘stable’ random variable can be naturally extended to the
vector-valued case (cf. Definitions 1 and 2).


DEFINITION 6. We call a random vector

$$X = (X_1, X_2, \dots, X_d)$$

a stable random vector in $\mathbb{R}^d$ or a vector with stable $d$-dimensional distribution if for
each pair of positive numbers $A$, $B$ there exists a positive number $C$ and a vector
$D \in \mathbb{R}^d$ such that

$$\mathrm{Law}(AX^{(1)} + BX^{(2)}) = \mathrm{Law}(CX + D), \tag{30}$$

where $X^{(1)}$ and $X^{(2)}$ are independent copies of $X$.


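It can be shown (see, e.g., [418; p. 58]) that a nondegenerate random vector
$X = (X_1, X_2, \dots, X_d)$ is stable if and only if for each $n \ge 2$ there exist $\alpha \in (0, 2]$
and a vector $D_n$ such that

$$\mathrm{Law}(X^{(1)} + X^{(2)} + \dots + X^{(n)}) = \mathrm{Law}(n^{1/\alpha} X + D_n), \tag{31}$$

where $X^{(1)}, X^{(2)}, \dots, X^{(n)}$ are independent copies of $X$.

If $D_n = 0$, i.e.,

$$\mathrm{Law}(X^{(1)} + X^{(2)} + \dots + X^{(n)}) = \mathrm{Law}(n^{1/\alpha} X), \tag{32}$$

then we call $X$ a strictly stable random vector with exponent $\alpha$ or a strictly $\alpha$-stable
random vector.


TABLE 2
(distributions with densities)

Distribution | Density $p = p(x)$ | Parameters
Uniform on $[a,b]$ | $\frac{1}{b-a}$, $a \le x \le b$ | $a, b \in \mathbb{R}$, $a < b$
Normal (Gaussian) | $\frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$, $x \in \mathbb{R}$ | $\mu \in \mathbb{R}$, $\sigma > 0$
Gamma ($\Gamma$-distribution) | $\frac{x^{\alpha-1} e^{-x/\beta}}{\Gamma(\alpha)\beta^{\alpha}}$, $x \ge 0$ | $\alpha > 0$, $\beta > 0$
Exponential (gamma distribution with $\alpha = 1$, $\beta = 1/\lambda$) | $\lambda e^{-\lambda x}$, $x \ge 0$ | $\lambda > 0$
t-distribution (Student distribution) | $\frac{\Gamma(\frac{n+1}{2})}{\sqrt{\pi n}\,\Gamma(\frac{n}{2})} \big(1 + \frac{x^2}{n}\big)^{-\frac{n+1}{2}}$, $x \in \mathbb{R}$ | $n = 1, 2, \dots$
Beta ($\beta$-distribution) | $\frac{x^{r-1}(1-x)^{s-1}}{B(r,s)}$, $0 \le x \le 1$ | $r > 0$, $s > 0$
Two-sided exponential (Laplace distribution) | $\frac{\lambda}{2} e^{-\lambda|x|}$, $x \in \mathbb{R}$ | $\lambda > 0$
Chi-squared ($\chi^2$ distribution; gamma distribution with $\alpha = n/2$, $\beta = 2$) | $\frac{x^{n/2-1} e^{-x/2}}{2^{n/2}\Gamma(\frac{n}{2})}$, $x \ge 0$ | $n = 1, 2, \dots$
Cauchy | $\frac{\sigma}{\pi((x-\mu)^2 + \sigma^2)}$, $x \in \mathbb{R}$ | $\mu \in \mathbb{R}$, $\sigma > 0$
Pareto | $\frac{\alpha b^{\alpha}}{x^{\alpha+1}}$, $x \ge b$ | $\alpha > 0$, $b > 0$
Log-normal | $\frac{1}{x\sigma\sqrt{2\pi}}\, e^{-\frac{(\ln x - \mu)^2}{2\sigma^2}}$, $x > 0$ | $\mu \in \mathbb{R}$, $\sigma > 0$
Logistic | $\frac{\beta e^{-(\alpha + \beta x)}}{(1 + e^{-(\alpha + \beta x)})^2}$, $x \in \mathbb{R}$ | $\alpha \in \mathbb{R}$, $\beta > 0$
Hyperbolic | see (2) in § 1d | $\alpha, \beta, \mu, \delta$; see (5) in § 1d
Gaussian\\inverse Gaussian | see (14) in § 1d | $\alpha, \beta, \mu, \delta$; see (5) in § 1d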


Remark. Along with using the notation $\mathrm{Law}(X) = \mathrm{Law}(Y)$, one often writes $X \stackrel{d}{=} Y$
to indicate that the distributions of $X$ and $Y$ are the same; one says that $X$ and $Y$
coincide in distribution. The notation $X_n \xrightarrow{d} X$ or $\mathrm{Law}(X_n) \to \mathrm{Law}(X)$ indicates
convergence in distribution (as already pointed out in subsection 2), i.e., weak
convergence of the corresponding measures. (For more detail, see [439; Chapter III].)
If $X = (X_t)_{t\ge0}$ and $Y = (Y_t)_{t\ge0}$ are two stochastic processes, then we write

$$\{X_t, t \ge 0\} \stackrel{d}{=} \{Y_t, t \ge 0\}$$

or

$$\mathrm{Law}(X_t, t \ge 0) = \mathrm{Law}(Y_t, t \ge 0)$$

to denote that all the finite-dimensional distributions of $X$ and $Y$ are the same,
and we shall say that $X$ and $Y$ coincide in distribution.


§ 1b. Lévy Processes 


1. The Lévy processes, which we discuss below, are certain stochastic processes 
with independent increments. They form one of the most important classes of 
stochastic processes, which includes such basic objects of probability theory as 
Brownian motions and Poisson processes. 

DEFINITION 1. We call a stochastic process $X = (X_t)_{t\ge0}$ with state space $\mathbb{R}^d$
defined on the probability space $(\Omega, \mathcal{F}, P)$ a ($d$-dimensional) Lévy process if

1) $X_0 = 0$ ($P$-a.s.);

2) for each $n \ge 1$ and each collection $t_0, t_1, \dots, t_n$, $0 \le t_0 < t_1 < \dots < t_n$, the
variables $X_{t_0}, X_{t_1} - X_{t_0}, \dots, X_{t_n} - X_{t_{n-1}}$ are independent (the property of
independent increments);

3) for all $s \ge 0$ and $t \ge 0$,

$$X_{t+s} - X_s \stackrel{d}{=} X_t - X_0$$

(the ‘homogeneity’ property of increments);

4) for all $t \ge 0$ and $\varepsilon > 0$,

$$\lim_{s \to t} P(|X_s - X_t| > \varepsilon) = 0$$

(the property of stochastic continuity);

5) for $P$-almost all $\omega \in \Omega$ the trajectories $(X_t(\omega))_{t\ge0}$ belong to the space $D^d$ of
(vector-valued) functions $f = (f_t)_{t\ge0}$, $f_t = (f_t^1, f_t^2, \dots, f_t^d)$, with components
$f^i = (f_t^i)_{t\ge0}$, $i = 1, \dots, d$, that are right-continuous and with limits from the
left for all $t > 0$.


Remark 1. If we ask that the process $X = (X_t)_{t\ge0}$ in this definition have only
properties 1)-4), then it can be shown that there exists a modification $X' = (X'_t)_{t\ge0}$
of $X = (X_t)_{t\ge0}$ (i.e., $P(X'_t \ne X_t) = 0$ for $t \ge 0$) with property 5). Thus, the
process $X'$ is the same as $X$ as regards the properties 1)-4), but its trajectories are
‘regular’ in a certain sense. For that reason, one incorporates property 5) of the
trajectories in the definition of Lévy processes from the very beginning (without
loss of generality).


Remark 2. Duly interpreting conditions 1)-5) we can reformulate the definition of
a Lévy process as follows: this is a stochastically continuous process with
homogeneous independent increments that starts from the origin and has right-continuous
trajectories with limits from the left.

A classical example of such a process is a $d$-dimensional Brownian motion
$X = (X^1, X^2, \dots, X^d)$, the components of which are independent standard
Brownian motions $X^i = (X_t^i)_{t\ge0}$, $i = 1, \dots, d$.

It is instructive to define a (one-dimensional) Brownian motion separately, out- 
side the general framework of Lévy processes. 


DEFINITION 2. We call a continuous Gaussian random process $X = (X_t)_{t\ge0}$ a
(standard) Brownian motion or a Wiener process if $X_0 = 0$ and

$$EX_t = 0, \qquad EX_s X_t = \min(s, t). \tag{1}$$

By the Gaussian property and (1) we immediately obtain that this is a process
with homogeneous independent (Gaussian) increments. Since

$$X_t - X_s \sim \mathcal{N}(0, t - s), \qquad t \ge s,$$

it follows that $E|X_t - X_s|^4 = 3|t - s|^2$, and by the well-known Kolmogorov test
([470]; see also (7) in § 2c below) there exists a continuous modification of this
process. Hence the Wiener process (the Brownian motion) is a Lévy process with
an additional important feature of continuous trajectories.


2. Lévy processes $X = (X_t)_{t\ge0}$ have homogeneous independent increments,
therefore their distributions are completely determined by the one-dimensional
distributions $P_t(dx) = P(X_t \in dx)$. (Recall that $X_0 = 0$.) By the mere definition of these
processes, the distribution $P_t(dx)$ is infinitely divisible for each $t$.

Let

$$\varphi_t(\theta) = E e^{i(\theta, X_t)} = \int_{\mathbb{R}^d} e^{i(\theta, x)}\, P_t(dx) \tag{2}$$

be the characteristic function. Then, by formula (22) in § 1a (see also (24) in the
same § 1a),

$$\varphi_t(\theta) = \exp\Big\{ i(\theta, B_t) - \frac{1}{2}(\theta, C_t\theta) + \int_{\mathbb{R}^d} \big(e^{i(\theta, x)} - 1 - i(\theta, x) I(|x| \le 1)\big)\, \nu_t(dx) \Big\}, \tag{3}$$

where $B_t \in \mathbb{R}^d$, $C_t$ is a symmetric nonnegative definite matrix of order $d \times d$, and
$\nu_t(dx)$ is the Lévy measure (for each $t$) with property (23) in § 1a.
The increments of Lévy processes are homogeneous and independent, therefore

$$\varphi_{t+s}(\theta) = \varphi_t(\theta)\varphi_s(\theta), \tag{4}$$

so that

$$\varphi_t(\theta) = \exp\{t\psi(\theta)\}. \tag{5}$$

(The function $\psi = \psi(\theta)$ is called the cumulant or the cumulant function.)
Since the triplets $(B_t, C_t, \nu_t)$ are unambiguously defined by the characteristic
function, it follows by (5) (see, e.g., [250; Chapter II, 4.19] for greater detail) that

$$B_t = t \cdot B, \qquad C_t = t \cdot C, \qquad \nu_t(dx) = t \cdot \nu(dx), \tag{6}$$

where $B = B_1$, $C = C_1$, and $\nu = \nu_1$.
Hence it is clear that in (5) we have

$$\psi(\theta) = i(\theta, B) - \frac{1}{2}(\theta, C\theta) + \int_{\mathbb{R}^d} \big(e^{i(\theta, x)} - 1 - i(\theta, x) I(|x| \le 1)\big)\, \nu(dx). \tag{7}$$


3. The representation (5) with cumulant $\psi(\theta)$ as in (7) is the main tool in the study
of the analytic properties of Lévy processes. As regards the properties of their
trajectories, on the other hand, the so-called canonical representation (see § 3a,
Chapter VI and [250; Chapter II, § 2c] for greater detail) is of importance. It
generalizes the canonical representations of Chapter II, § 1b for stochastic sequences
$H = (H_n)_{n\ge0}$ (see (16) in § 1b and also Chapter IV, § 3e) to the continuous-time
case.


4. We now discuss the meaning of components in the triplet $(B_t, C_t, \nu_t)_{t\ge0}$.
Figuratively, $(B_t)_{t\ge0}$ is the trend component responsible for the development of the
process $X = (X_t)_{t\ge0}$ on the average. The component $(C_t)_{t\ge0}$ defines the
variance of the continuous Gaussian component of $X$, while the Lévy measures $(\nu_t)_{t\ge0}$
are responsible for the behavior of the ‘jump’ component of $X$ by exhibiting the
frequency and magnitudes of ‘jumps’.

Of course, this (fairly liberal) interpretation must be substantiated by precise
results. Here is one such result (see [250; Chapter II, 2.21] as regards the general
case).

Let $X = (X_t)_{t\ge0}$ be a process with ‘jumps’ $|\Delta X_t| \le 1$, $t \ge 0$, let $X_0 = 0$, and
let $(B_t, C_t, \nu_t)_{t\ge0}$ be the corresponding triplet. Then $EX_t^2 < \infty$, $t \ge 0$, and the
following processes are martingales (see Chapter II, § 1c):

a) $M_t \equiv X_t - B_t - X_0$, $t \ge 0$;

b) $M_t^2 - C_t$, $t \ge 0$;

c) $\int g(x)\, \mu_t(dx) - \int g(x)\, \nu_t(dx)$, $t \ge 0$,

where $\mu_t(A) = \sum_{0<s\le t} I(\Delta X_s \in A, \Delta X_s \ne 0)$ is the measure of the ‘jumps’ of $X$ on
the interval $(0, t]$ and $g = g(x)$ are continuous functions vanishing in a neighborhood
of the origin.

As already pointed out, a standard Brownian motion is a classical example of a
continuous Lévy process (with $B_t = 0$, $C_t = t$, and $\nu_t \equiv 0$).

We now consider examples of discontinuous Lévy processes, which, at the same
time, will give us a better insight into the concept of Lévy measure $\nu = \nu(dx)$.


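5. The case of a finite Lévy measure ($\nu(\mathbb{R}) < \infty$). A classical example here
is, of course, a Poisson process $X = (X_t)_{t\ge0}$ with $\lambda > 0$, i.e. (by definition), a Lévy
process with $X_0 = 0$ such that $X_t$ has the Poisson distribution with parameter $\lambda t$:

$$P(X_t = k) = \frac{e^{-\lambda t}(\lambda t)^k}{k!}, \qquad k = 0, 1, \dots.$$

In this case we have $B_t = \lambda t$ ($= EX_t$), $C_t = 0$ and the Lévy measure is concentrated
at a single point:

$$\nu(dx) = \lambda I_{\{1\}}(dx).$$

The representation (3) takes now the following form:

$$\begin{aligned}
\varphi_t(\theta) &= \exp\Big\{ i\theta B_t + \int_{\mathbb{R}} \big(e^{i\theta x} - 1 - i\theta x I(|x| \le 1)\big)\, \nu_t(dx) \Big\} \\
&= \exp\Big\{ i\theta\lambda t + \int_{\mathbb{R}} \big(e^{i\theta x} - 1 - i\theta x I(|x| \le 1)\big)\, \lambda t\, I_{\{1\}}(dx) \Big\} \\
&= \exp\{\lambda t(e^{i\theta} - 1)\}.
\end{aligned} \tag{8}$$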

It is worth noting that, starting from the Poisson process, one can arrive at a wide class of purely jump Lévy processes.

Namely, let $N = (N_t)_{t \ge 0}$ be a Poisson process with parameter $\lambda > 0$ and let $\xi = (\xi_j)_{j \ge 1}$ be a sequence of independent identically distributed random variables (that must also be independent of $N$) with distribution

$$\mathsf{P}(\xi_j \in A) = \frac{\nu(A)}{\lambda}, \qquad A \in \mathcal{B}(\mathbb{R}),$$

where $\lambda = \nu(\mathbb{R}) < \infty$ and $\nu(\{0\}) = 0$.
We consider now the process $X = (X_t)_{t \ge 0}$ such that $X_0 = 0$ and

$$X_t = \sum_{j=1}^{N_t} \xi_j, \qquad t > 0. \tag{9}$$

Alternatively, this can be written as follows:

$$X_t = \sum_{j=1}^{\infty} \xi_j I(\tau_j \le t), \tag{10}$$

where $0 < \tau_1 < \tau_2 < \cdots$ are the times of jumps of the process $N = (N_t)_{t \ge 0}$.
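A direct calculation shows that

$$\varphi_t(\theta) = \mathsf{E} e^{i\theta X_t} = \sum_{k=0}^{\infty} \mathsf{E}\bigl(e^{i\theta X_t} \mid N_t = k\bigr)\,\mathsf{P}(N_t = k)$$
$$= \sum_{k=0}^{\infty} \bigl(\mathsf{E} e^{i\theta \xi_1}\bigr)^k\, \frac{e^{-\lambda t} (\lambda t)^k}{k!} = \exp\Bigl\{ t \int_{\mathbb{R}} (e^{i\theta x} - 1)\,\nu(dx) \Bigr\}. \tag{11}$$

The process $X = (X_t)_{t \ge 0}$ defined by (10) is called a compound Poisson process. It is easy to see that it is a Lévy process. The ‘standard’ Poisson process corresponds to the case $\xi_j \equiv 1$, $j \ge 1$.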
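The construction (10) translates almost literally into a simulation. The following is a minimal sketch, not a reference implementation: the function names, the `jump_sampler` argument, and the Gaussian choice of jump law are illustrative assumptions. It draws exponential inter-arrival times for $N$, attaches independent jumps $\xi_j$, and compares the empirical mean of $X_t$ with $\lambda t\, \mathsf{E}\xi_1$.

```python
import random


def compound_poisson_path(lam, horizon, jump_sampler, rng):
    """Return the jump times tau_j <= horizon and the jumps xi_j of one path.

    lam          -- intensity of the driving Poisson process N (lam = nu(R))
    horizon      -- right end of the time interval [0, horizon]
    jump_sampler -- hypothetical callable rng -> xi_j, sampling from nu(dx)/lam
    """
    times, jumps = [], []
    t = rng.expovariate(lam)          # first inter-arrival time, Exp(lam)
    while t <= horizon:
        times.append(t)
        jumps.append(jump_sampler(rng))
        t += rng.expovariate(lam)     # next inter-arrival time
    return times, jumps


def value_at(times, jumps, t):
    """X_t = sum_j xi_j * I(tau_j <= t), i.e. formula (10)."""
    return sum(x for tau, x in zip(times, jumps) if tau <= t)


rng = random.Random(2024)
lam, horizon = 3.0, 2.0

# Jumps xi_j ~ N(1, 0.5^2): an arbitrary illustrative choice of nu(dx)/lam.
samples = []
for _ in range(20000):
    times, jumps = compound_poisson_path(lam, horizon, lambda r: r.gauss(1.0, 0.5), rng)
    samples.append(value_at(times, jumps, horizon))

empirical_mean = sum(samples) / len(samples)
theoretical_mean = lam * horizon * 1.0    # E X_t = lam * t * E xi_1
print(empirical_mean, theoretical_mean)
```

Passing `lambda r: 1.0` as the jump sampler gives the ‘standard’ Poisson process, for which `value_at` simply counts the jump times up to `t`.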


6. The case of infinite Lévy measure ($\nu(\mathbb{R}) = \infty$). To construct the simplest example of a Lévy process with $\nu(\mathbb{R}) = \infty$ we can proceed as follows.

Let $\lambda = (\lambda_k)_{k \ge 1}$ be a sequence of positive numbers and let $\beta = (\beta_k)_{k \ge 1}$ be a sequence of nonzero real numbers such that

$$\sum_{k=1}^{\infty} \lambda_k \beta_k^2 < \infty. \tag{12}$$

We set

$$\nu(dx) = \sum_{k=1}^{\infty} \lambda_k I_{\{\beta_k\}}(dx) \tag{13}$$

and let $N^{(k)} = (N_t^{(k)})_{t \ge 0}$, $k \ge 1$, be a sequence of independent Poisson processes with parameters $\lambda_k$, $k \ge 1$, respectively.
Setting

$$X_t^{(n)} = \sum_{k=1}^{n} \beta_k \bigl(N_t^{(k)} - \lambda_k t\bigr), \tag{14}$$

it is easy to see that for each $n \ge 1$ the process $X^{(n)} = (X_t^{(n)})_{t \ge 0}$ is a Lévy process with Lévy measure

$$\nu^{(n)}(dx) = \sum_{k=1}^{n} \lambda_k I_{\{\beta_k\}}(dx) \tag{15}$$

and

$$\varphi_t^{(n)}(\theta) = \mathsf{E} e^{i\theta X_t^{(n)}} = \exp\Bigl\{ t \int (e^{i\theta x} - 1 - i\theta x)\,\nu^{(n)}(dx) \Bigr\}. \tag{16}$$

The limit process $X = (X_t)_{t \ge 0}$,

$$X_t = \sum_{k=1}^{\infty} \beta_k \bigl(N_t^{(k)} - \lambda_k t\bigr) \tag{17}$$

($X_t$ is the $L^2$-limit of the $X_t^{(n)}$ as $n \to \infty$) is also a Lévy process with ‘Lévy measure’ defined by (13).


Remark 3. In this case we have property 5) from Definition 1 because the $X^{(n)}$ are square integrable martingales, for which, by the Doob inequality (see formula (36) in § 3b and also [250; Chapter I, 1.43] or [439; Chapter VII, § 3]), we have

$$\mathsf{E} \max_{s \le t} \bigl|X_s^{(n)} - X_s\bigr|^2 \to 0 \quad \text{as } n \to \infty.$$


Remark 4. Since $\nu(\mathbb{R}) = \sum_{k=1}^{\infty} \lambda_k$, $\nu(\{0\}) = 0$ and

$$\int_{\mathbb{R}} (x^2 \wedge 1)\,\nu(dx) \le \sum_{k=1}^{\infty} \lambda_k \beta_k^2 < \infty,$$

the measure $\nu = \nu(dx)$ in (13) satisfies all the conditions imposed on a Lévy measure (see formulas (22)–(23) in § 1a).

If $\sum_{k=1}^{\infty} \lambda_k = \infty$, but (12) holds, then we have an example of a Lévy process with Lévy measure $\nu$ satisfying the relation $\nu(\mathbb{R}) = \infty$.
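As a concrete illustration of Remark 4, one may take $\lambda_k = k$ and $\beta_k = k^{-2}$. This particular pair of sequences is an assumption chosen for the example, not taken from the text. Then $\sum \lambda_k$ diverges while $\sum \lambda_k \beta_k^2 = \sum k^{-3}$ converges, so condition (12) holds with $\nu(\mathbb{R}) = \infty$. The sketch below prints the partial sums.

```python
def partial_sums(n):
    """Return (sum of lambda_k, sum of lambda_k * beta_k**2) over k = 1..n
    for the illustrative choice lambda_k = k, beta_k = k**-2."""
    total_mass = 0.0       # partial sum of nu(R) = sum lambda_k
    second_moment = 0.0    # partial sum appearing in condition (12)
    for k in range(1, n + 1):
        lam_k = float(k)
        beta_k = 1.0 / k**2
        total_mass += lam_k
        second_moment += lam_k * beta_k**2
    return total_mass, second_moment


for n in (10, 1000, 100000):
    print(n, partial_sums(n))
```

The first component grows like $n^2/2$, while the second stabilizes near $1.202$.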

We now present another well-known example of a Lévy process with $\nu(\mathbb{R}) = \infty$. We mean the so-called gamma process $X = (X_t)_{t \ge 0}$ with $X_0 = 0$ and probability distribution $\mathsf{P}(X_t \le x)$ with density

$$p_t(x) = \frac{x^{t-1} e^{-x/\beta}}{\Gamma(t)\,\beta^t}\, I_{(0,\infty)}(x) \tag{18}$$

(cf. Table 2 in § 1a).

Now,

$$\varphi_t(\theta) = (1 - i\theta\beta)^{-t}. \tag{19}$$

We claim that the characteristic function $\varphi_t(\theta)$ can be represented as follows:

$$\varphi_t(\theta) = \exp\Bigl\{ t \int_0^{\infty} (e^{i\theta x} - 1)\,\frac{e^{-x/\beta}}{x}\,dx \Bigr\}, \tag{20}$$

which yields the equality $\nu_t(dx) = t\,\nu(dx)$, where

$$\nu(dx) = I_{(0,\infty)}(x)\,\frac{e^{-x/\beta}}{x}\,dx. \tag{21}$$

Clearly, $\nu(0, \infty) = \infty$; however,

$$\int_0^{\infty} (x^2 \wedge 1)\,\nu(dx) < \infty. \tag{22}$$

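We now proceed to the proof of the representation (20).
We consider the Laplace transform

$$L_t(u) = \mathsf{E} e^{-uX_t} = \int_0^{\infty} e^{-ux} p_t(x)\,dx = (1 + \beta u)^{-t}$$
$$= \exp\{-t \ln(1 + \beta u)\} = \exp\Bigl\{-t \int_0^u \frac{dy}{\frac{1}{\beta} + y}\Bigr\}$$
$$= \exp\Bigl\{-t \int_0^u dy \int_0^{\infty} e^{-(\frac{1}{\beta} + y)x}\,dx\Bigr\}$$
$$= \exp\Bigl\{t \int_0^{\infty} (e^{-ux} - 1)\,\frac{e^{-x/\beta}}{x}\,dx\Bigr\}.$$

Using analytic continuation in the complex half-plane $\{z = a + ib,\ a \le 0\}$ we obtain

$$\int_{\mathbb{R}} e^{zx} p_t(x)\,dx = \exp\Bigl\{t \int_0^{\infty} (e^{zx} - 1)\,\frac{e^{-x/\beta}}{x}\,dx\Bigr\}.$$

Setting here $z = i\theta$, we arrive at the required representation (20).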
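The last equality in the chain for the Laplace transform can be checked numerically. The sketch below is only a sanity check under stated assumptions: the midpoint rule, the truncation point `60 * beta`, and the parameter values are arbitrary choices made for the illustration. It compares $(1 + \beta u)^{-t}$ with the exponential of the integral against the Lévy measure (21).

```python
import math


def levy_integral(u, beta, steps=200000):
    """Midpoint-rule value of int_0^inf (exp(-u*x) - 1) * exp(-x/beta) / x dx.

    The integral is truncated at 60 * beta, where exp(-x/beta) is about exp(-60).
    """
    upper = 60.0 * beta
    h = upper / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h      # midpoints avoid the removable singularity at x = 0
        total += math.expm1(-u * x) * math.exp(-x / beta) / x
    return total * h


def laplace_closed_form(u, beta, t):
    """(1 + beta*u)**(-t), the Laplace transform of the gamma density (18)."""
    return (1.0 + beta * u) ** (-t)


def laplace_via_levy_measure(u, beta, t):
    """exp{t * integral}, the right-hand side of the derivation."""
    return math.exp(t * levy_integral(u, beta))


u, beta, t = 1.5, 2.0, 0.7
print(laplace_closed_form(u, beta, t), laplace_via_levy_measure(u, beta, t))
```

For these parameters the integral itself should be close to $-\ln(1 + \beta u) = -\ln 4$.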




7. Now that we have obtained ‘explicit’ representations (10), (14), and (17) for several (jump) Lévy processes, we have at our disposal a method of their simulation. We need only simulate the random variables $\xi_j$ and $\beta_k$ and the exponentially distributed variables $\Delta_i = \tau_i - \tau_{i-1}$ (the time intervals between two jumps of the Poisson process at instants $\tau_{i-1}$ and $\tau_i$). For the simulation of infinitely divisible random variables, in turn, it becomes of interest how they can be represented as functions of ‘elementary’, ‘standard’ random variables. Here is an example demonstrating the opportunities emerging here. Let $X$ and $Y$ be two independent random variables such that $X \ge 0$ (and $X$ is otherwise arbitrary), while $Y$ has an exponential distribution. Then, as shown by Ch. Goldie, their product $XY$ is an infinitely divisible random variable.

In the next section we shall show how one can ‘combine’ processes of simple structure and obtain stable processes as a result.


§ 1c. Stable Processes


1. We start with the case of a real-valued process $X = (X_t)_{t \in T}$ with an (arbitrary) parameter set $T$, which is defined on some probability space $(\Omega, \mathcal{F}, \mathsf{P})$.

Generalizing the definition of a stable random vector (Definition 6 in § 1a) we arrive in a natural way at the following concept.


DEFINITION 1. We say that a real random process $X = (X_t)_{t \in T}$ is stable if for each $k \ge 1$ and all $t_1, \dots, t_k$ in $T$ the random vectors $(X_{t_1}, \dots, X_{t_k})$ are stable, i.e., all finite-dimensional distributions of $X$ are stable.


If all these finite-dimensional distributions are stable, then it is easy to see from their compatibility that they have the same stability exponent $\alpha$. This explains the name $\alpha$-stable that one often gives to such processes when one wants to point out some particular value of $\alpha$.

In what follows, we shall be mainly interested in $\alpha$-stable processes $X = (X_t)_{t \ge 0}$ that are at the same time Lévy processes (§ 1b). A suitable name for them is $\alpha$-stable Lévy processes.

In § 1a.10 we presented two equivalent definitions of stable nondegenerate random vectors (see formulas (30) and (31) in § 1a). In the case of stable (one-dimensional) processes $X = (X_t)_{t \ge 0}$ that are also Lévy processes, this equivalence brings us to the following result: a nondegenerate (see Definition 2 below) one-dimensional Lévy process $X = (X_t)_{t \ge 0}$ is $\alpha$-stable ($\alpha \in (0, 2]$) if and only if for each $a > 0$ there exists a number $D$ (depending on $a$ in general) such that

$$\{X_{at},\ t \ge 0\} \stackrel{d}{=} \{a^{1/\alpha} X_t + Dt,\ t \ge 0\}. \tag{1}$$

We now present several definitions relating to multidimensional processes $X = (X_t)_{t \ge 0}$.


DEFINITION 2. We say that a random process $X = (X_t)_{t \ge 0}$ taking values in $\mathbb{R}^d$ is degenerate if $X_t = \gamma t$ ($\mathsf{P}$-a.s.) for some $\gamma \in \mathbb{R}^d$ and all $t \ge 0$. Otherwise we say that the process $X$ is nondegenerate.

DEFINITION 3. We call a nondegenerate random process $X = (X_t)_{t \ge 0}$ taking values in $\mathbb{R}^d$ an $\alpha$-stable Lévy process ($\alpha \in (0, 2]$) if

1) $X$ is a Lévy process
and

2) for each $a > 0$ there exists $D \in \mathbb{R}^d$ (dependent on $a$ in general) such that

$$\{X_{at},\ t \ge 0\} \stackrel{d}{=} \{a^{1/\alpha} X_t + Dt,\ t \ge 0\} \tag{2}$$

or, equivalently,

$$\mathrm{Law}(X_{at},\ t \ge 0) = \mathrm{Law}(a^{1/\alpha} X_t + Dt,\ t \ge 0). \tag{3}$$




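DEFINITION 4. We call an $\alpha$-stable Lévy process $X = (X_t)_{t \ge 0}$ a strictly $\alpha$-stable Lévy process if $D = 0$ in (2) and (3), i.e.,

$$\{X_{at},\ t \ge 0\} \stackrel{d}{=} \{a^{1/\alpha} X_t,\ t \ge 0\} \tag{4}$$

or, equivalently,

$$\mathrm{Law}(X_{at},\ t \ge 0) = \mathrm{Law}(a^{1/\alpha} X_t,\ t \ge 0). \tag{5}$$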


Remark 1. Sometimes (see, e.g., [423]), a stable vector-valued Lévy random process $X = (X_t)_{t \ge 0}$ is defined as follows: for each $a > 0$ there exist a number $c$ and $D \in \mathbb{R}^d$ such that

$$\{X_{at},\ t \ge 0\} \stackrel{d}{=} \{cX_t + Dt,\ t \ge 0\} \tag{6}$$

or, equivalently,

$$\mathrm{Law}(X_{at},\ t \ge 0) = \mathrm{Law}(cX_t + Dt,\ t \ge 0). \tag{7}$$

(If $D = 0$, then one talks about strict stability.) It is remarkable that, as in the case of stable variables and vectors, we have $c = a^{1/\alpha}$ for nondegenerate Lévy processes, where $\alpha$ is a universal (i.e., independent of $a$) parameter belonging to $(0, 2]$. This result, the proof of which can be found, e.g., in [423], explains the explicit involvement of the coefficient $a^{1/\alpha}$ in Definitions 3 and 4.


Remark 2. It is useful to note that the condition

$$\mathrm{Law}(X_{at},\ t \ge 0) = \mathrm{Law}(cX_t,\ t \ge 0)$$

is precisely that of self-similarity. Thus, strictly $\alpha$-stable Lévy processes, for which $c = a^{1/\alpha}$, are self-similar. The quantity $H = 1/\alpha$ here is called the Hurst parameter or the Hurst exponent. See § 2c for greater detail.


Remark 3. If a process $X = (X_t)_{t \ge 0}$ is $\alpha$-stable ($0 < \alpha < 2$) and has simultaneously the self-similarity property

$$\mathrm{Law}(X_{at},\ t \ge 0) = \mathrm{Law}(a^{H} X_t,\ t \ge 0), \qquad a > 0,$$

but is not a Lévy process, then the formula $H = 1/\alpha$ fails. Various pairs $(\alpha, H)$ can correspond to such processes, provided that

$$\alpha < 1 \quad \text{and} \quad 0 < H \le 1/\alpha,$$

or

$$\alpha \ge 1 \quad \text{and} \quad 0 < H \le 1.$$

(See [418; Corollary 7.1.11 and Figure 7.1].)

2. Since $\alpha$-stable processes are particular cases of Lévy processes $X = (X_t)_{t \ge 0}$, the characteristic functions of which, $\varphi_t(\theta) = \mathsf{E} e^{i(\theta, X_t)}$, have the representation $\varphi_t(\theta) = \exp\{t\psi(\theta)\}$ with cumulant $\psi(\theta)$ defined by formula (7) in § 1b, one may well ask what the parameters $B$, $C$, and $\nu$ corresponding to these ($\alpha$-stable) processes are. Of particular interest here is the Lévy measure $\nu = \nu(dx)$, which is ‘responsible’ for the magnitudes of jumps $\Delta X_t = X_t - X_{t-}$ of $X = (X_t)_{t \ge 0}$.

We now present, following [423], the main results obtained in this direction.


THEOREM 1. Let $X = (X_t)_{t \ge 0}$ be a nondegenerate Lévy process in $\mathbb{R}^d$ with triplet $(B, C, \nu)$. Then
1) the process $X$ is a 2-stable (Lévy) process if and only if $\nu = 0$, i.e., if and only if it is Gaussian;
2) the process $X$ is a strictly 2-stable (Lévy) process if and only if it is a Gaussian process with zero mean ($B = 0$).


THEOREM 2. Let $X = (X_t)_{t \ge 0}$ be a nondegenerate Lévy process in $\mathbb{R}^d$ with triplet $(B, C, \nu)$. Assume that $0 < \alpha < 2$. Then $X$ is an $\alpha$-stable (Lévy) process if and only if $C = 0$ and the Lévy measure $\nu$ is as follows:

$$\nu(A) = \int_S \lambda(d\xi) \int_0^{\infty} I_A(r\xi)\, r^{-(1+\alpha)}\,dr, \qquad A \in \mathcal{B}(\mathbb{R}^d \setminus \{0\}), \tag{8}$$

where $\lambda$ is some nonzero finite measure on $S = \{x \in \mathbb{R}^d : |x| = 1\}$.
The cumulant $\psi(\theta)$ of such a process has the following representation:

$$\psi(\theta) = i(\theta, B) + \int_S \lambda(d\xi) \int_0^{\infty} \bigl(e^{i(\theta, r\xi)} - 1 - i(\theta, r\xi) I_{(0,1]}(r)\bigr)\, r^{-(1+\alpha)}\,dr. \tag{9}$$

It is worth noting that, if $0 < \alpha < 2$, then the radial part of the Lévy measure is $r^{-(1+\alpha)}\,dr$. As $\alpha$ decreases, $r^{-(1+\alpha)}$ decreases for $0 < r < 1$ and increases for $1 < r < \infty$. Thus, we can say that large jumps of the process trajectories prevail for $\alpha$ close to zero, while the process develops in small jumps if $\alpha$ is close to two. Good illustrations of this property can be found in [253].
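The effect of $\alpha$ on the radial part $r^{-(1+\alpha)}\,dr$ can be made tangible by integrating it over ‘small’ and ‘large’ radii. In the sketch below the cut-off $0.01$ and the three values of $\alpha$ are arbitrary illustrative choices, and `radial_mass` is a hypothetical helper name; it uses the elementary antiderivative $\int_l^u r^{-(1+\alpha)}\,dr = (l^{-\alpha} - u^{-\alpha})/\alpha$.

```python
def radial_mass(alpha, lower, upper):
    """Integral of r**-(1 + alpha) over (lower, upper), 0 < lower < upper <= inf."""
    upper_term = 0.0 if upper == float("inf") else upper ** (-alpha)
    return (lower ** (-alpha) - upper_term) / alpha


for alpha in (0.2, 1.0, 1.8):
    small = radial_mass(alpha, 0.01, 1.0)           # jumps with 0.01 < r < 1
    large = radial_mass(alpha, 1.0, float("inf"))   # jumps with r > 1
    print(alpha, small, large, large / small)
```

The ratio of ‘large’ to ‘small’ mass falls sharply as $\alpha$ grows from $0.2$ to $1.8$, in line with the qualitative statement above.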


THEOREM 3. Let $X = (X_t)_{t \ge 0}$ be a nondegenerate Lévy process in $\mathbb{R}^d$ with triplet $(B, C, \nu)$.
1) Let $\alpha \in (0, 1)$. Then $X$ is a strictly $\alpha$-stable (Lévy) process if and only if it has the cumulant

$$\psi(\theta) = \int_S \lambda(d\xi) \int_0^{\infty} \bigl(e^{i(\theta, r\xi)} - 1\bigr)\, r^{-(1+\alpha)}\,dr \tag{10}$$

with some nonzero finite measure $\lambda$ on $S$, i.e., if and only if $C = 0$ and the ‘drift’ (see (27) in § 1a) is zero.
2) Let $\alpha \in (1, 2)$. Then $X$ is a strictly $\alpha$-stable (Lévy) process if and only if $C = 0$ and the ‘center’ (see (29) in § 1a) is zero (i.e., $\mathsf{E} X_t = 0$), or, equivalently, if and only if its cumulant is

$$\psi(\theta) = \int_S \lambda(d\xi) \int_0^{\infty} \bigl(e^{i(\theta, r\xi)} - 1 - i(\theta, r\xi)\bigr)\, r^{-(1+\alpha)}\,dr \tag{11}$$

with some nonzero finite measure $\lambda$ on $S$.
3) Let $\alpha = 1$. Then $X$ is a strictly 1-stable (Lévy) process if and only if

$$\psi(\theta) = i(\theta, B) + \int_S \lambda(d\xi) \int_0^{\infty} \bigl(e^{i(\theta, r\xi)} - 1 - i(\theta, r\xi) I_{(0,1]}(r)\bigr)\, r^{-2}\,dr \tag{12}$$

for some finite measure $\lambda$ on $S$ and some constant $B$ such that

$$\int_S \xi\,\lambda(d\xi) = 0 \tag{13}$$

and $\lambda(S) + |B| > 0$.


COROLLARY. Let $X = (X_t)_{t \ge 0}$ be an $\alpha$-stable Lévy process in $\mathbb{R}^d$. If $\alpha \ne 1$, then there always exists $\gamma \in \mathbb{R}^d$ such that the centered process $\widetilde{X} = (X_t - \gamma t)_{t \ge 0}$ is strictly $\alpha$-stable.

On the other hand, if $\alpha = 1$ and (13) holds, then the process $X = (X_t)_{t \ge 0}$ is itself strictly 1-stable.

For $d = 1$ we can explicitly calculate integrals in the representations (9)–(11) for cumulants to obtain

$$\psi(\theta) = \begin{cases} i\mu\theta - \sigma^{\alpha} |\theta|^{\alpha} \bigl(1 - i\beta (\operatorname{Sgn} \theta) \tan\frac{\pi\alpha}{2}\bigr), & \alpha \ne 1, \qquad (14) \\[4pt] i\mu\theta - \sigma |\theta| \bigl(1 + i\beta \frac{2}{\pi} (\operatorname{Sgn} \theta) \ln|\theta|\bigr), & \alpha = 1, \qquad (15) \end{cases}$$

for $0 < \alpha \le 2$ (cf. formula (6) in § 1a), where $\beta \in [-1, 1]$, $\sigma > 0$, and $\mu \in \mathbb{R}$.
A (nonzero) $\alpha$-stable Lévy process is strictly stable if and only if

$$\mu = 0 \quad \text{for } \alpha \ne 1,$$

and

$$\beta = 0, \quad \sigma + |\mu| > 0 \quad \text{for } \alpha = 1.$$

3. We shall now discuss formulas (2) and (3) in the definition of an $\alpha$-stable process.
For $t = 1$ formula (1) can be written as follows:

$$X_a \stackrel{d}{=} a^{1/\alpha} X_1 + D_a, \tag{16}$$

where $D_a$ is a constant. Using the representations (14) and (15) for the characteristic function of this process we obtain

$$D_a = \begin{cases} (a - a^{1/\alpha})\,\mu, & \alpha \ne 1, \\[4pt] \beta\,\dfrac{2}{\pi}\,\sigma\, a \ln a, & \alpha = 1. \end{cases} \tag{17}$$

4. We see from the above that operating with the distributions of stable processes can be difficult because it is only in three cases that we have explicit formulas for the densities (see § 1a.5).

However, in certain situations we can show a way of constructing stable processes, e.g., from a Brownian motion, by means of a random change of time (independent of the motion).

We present an interesting example in this direction concerning the simulation of symmetric $\alpha$-stable distributions using three independent random variables: uniformly distributed, Gaussian, and exponentially distributed.

Let $Z = (Z_t)_{t \ge 0}$ be a symmetric $\alpha$-stable Lévy process with characteristic function

$$\varphi_t(\theta) = \mathsf{E} e^{i\theta Z_t} = e^{-t|\theta|^{\alpha}}, \tag{18}$$

where $0 < \alpha < 2$.
It will be clear from what follows that the process $Z$ can be realized as

$$Z_t = B_{T_t}, \qquad t \ge 0, \tag{19}$$

where $B = (B_t)_{t \ge 0}$ is a Brownian motion with $\mathsf{E} B_t = 0$ and $\mathsf{E} B_t^2 = 2t$, while $T = (T_t)_{t \ge 0}$ is a nonnegative nondecreasing $\frac{\alpha}{2}$-stable random process, which is called a stable subordinator. We say of the process $Z$ obtained by the transformation (19) that it is constructed from a Brownian motion by means of a random change of time (subordination) $T = (T_t)_{t \ge 0}$.

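The process $T = (T_t)_{t \ge 0}$ required for (19) can be constructed as follows.

Let $U^{(\alpha)} = U^{(\alpha)}(\omega)$ be a nonnegative stable random variable with Laplace transform

$$\mathsf{E} e^{-\lambda U^{(\alpha)}} = e^{-\lambda^{\alpha}}, \qquad \lambda \ge 0, \tag{20}$$

where $0 < \alpha < 1$.
Note that if $U^{(\alpha)}$ and $U_1, \dots, U_n$ are independent identically distributed random variables, then the sum

$$n^{-1/\alpha} \sum_{j=1}^{n} U_j \tag{21}$$

has the same Laplace transform as $U^{(\alpha)}$, so that $U^{(\alpha)}$ is indeed a stable random variable.

Assume that $0 < \alpha < 2$. We now construct a nonnegative nondecreasing $\frac{\alpha}{2}$-stable process $T = (T_t)_{t \ge 0}$ such that $\mathrm{Law}(T_1) = \mathrm{Law}(U^{(\alpha/2)})$.

By the ‘self-similarity’ property (5),

$$\mathrm{Law}(T_t) = \mathrm{Law}(t^{2/\alpha} T_1) = \mathrm{Law}(t^{2/\alpha} U^{(\alpha/2)}), \tag{22}$$

therefore

$$\mathsf{E} e^{i\theta Z_t} = \mathsf{E} e^{i\theta B_{T_t}} = \mathsf{E}\bigl[\mathsf{E}(e^{i\theta B_{T_t}} \mid T_t)\bigr] = \mathsf{E} e^{-\frac{\theta^2}{2} \cdot 2T_t} = \mathsf{E} e^{-\theta^2 T_t}$$
$$= \mathsf{E} e^{-\theta^2 t^{2/\alpha} T_1} = \mathsf{E} e^{-\theta^2 t^{2/\alpha} U^{(\alpha/2)}} = e^{-(\theta^2 t^{2/\alpha})^{\alpha/2}} = e^{-t|\theta|^{\alpha}}, \tag{23}$$

which delivers the required representation (19).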

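Let $p = p(x; \alpha)$ be the distribution density of the variable $U = U^{(\alpha)}$, where $0 < \alpha < 1$. By [238] and [264], this density, which is concentrated on the half-axis $x \ge 0$, has the following ‘explicit’ representation:

$$p(x; \alpha) = \frac{1}{\pi}\,\frac{\alpha}{1 - \alpha}\,\Bigl(\frac{1}{x}\Bigr)^{\frac{1}{1-\alpha}} \int_0^{\pi} a(z; \alpha) \exp\Bigl\{-\Bigl(\frac{1}{x}\Bigr)^{\frac{\alpha}{1-\alpha}} a(z; \alpha)\Bigr\}\,dz, \tag{24}$$

where

$$a(z; \alpha) = \Bigl(\frac{\sin \alpha z}{\sin z}\Bigr)^{\frac{1}{1-\alpha}}\, \frac{\sin(1 - \alpha)z}{\sin \alpha z}. \tag{25}$$

As observed by H. Rubin (see [264; Corollary 4.1]), the density $p(x; \alpha)$ is the distribution density of the random variable

$$\zeta = \Bigl(\frac{a(\xi; \alpha)}{\eta}\Bigr)^{\frac{1-\alpha}{\alpha}}, \tag{26}$$

where $a = a(z; \alpha)$ is as in (25), $\xi$ and $\eta$ are independent random variables, $\xi$ is uniformly distributed on $[0, \pi]$, and $\eta$ has the exponential distribution with parameter one.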


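Remark. The fact that $p(x; \alpha)$ is the distribution density of $\zeta$ can be verified with no difficulty. For let

$$h(x) = \frac{1}{\pi} \int_0^{\pi} a(z; \alpha) \exp\bigl(-x\,a(z; \alpha)\bigr)\,dz.$$

Then $h = h(x)$ is obviously the density for $\eta / a(\xi; \alpha)$. A simple change of variables shows now that $p(x; \alpha)$ is the distribution density for $\zeta$.
Thus,

$$\mathrm{Law}(T_1) = \mathrm{Law}(U^{(\alpha/2)}) = \mathrm{Law}\Bigl(\Bigl(\frac{a(\xi; \alpha/2)}{\eta}\Bigr)^{\frac{2-\alpha}{\alpha}}\Bigr) \tag{27}$$

and, by (22),

$$\mathrm{Law}(T_t) = \mathrm{Law}(t^{2/\alpha} T_1). \tag{28}$$

From (27) and (28), denoting a Gaussian random variable with zero mean and variance 2 by $\gamma(0, 2)$, we see that

$$\mathrm{Law}(Z_t - Z_s) = \mathrm{Law}(B_{T_t} - B_{T_s}) = \mathrm{Law}(B_{T_t - T_s})$$
$$= \mathrm{Law}\bigl(\sqrt{T_t - T_s}\,\gamma(0, 2)\bigr) = \mathrm{Law}\bigl(\sqrt{T_{t-s}}\,\gamma(0, 2)\bigr)$$
$$= \mathrm{Law}\bigl((t - s)^{1/\alpha} \sqrt{T_1}\,\gamma(0, 2)\bigr)$$
$$= \mathrm{Law}\Bigl((t - s)^{1/\alpha} \Bigl(\frac{a(\xi; \alpha/2)}{\eta}\Bigr)^{\frac{2-\alpha}{2\alpha}} \gamma(0, 2)\Bigr).$$

This representation for $\mathrm{Law}(Z_t - Z_s)$ shows a way in which, simulating three independent random variables $\xi$, $\eta$, and $\gamma = \gamma(0, 2)$, one can obtain an observation sample for the increments $Z_t - Z_s$ of a symmetric $\alpha$-stable random process.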
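The recipe just described can be turned into a short simulation. The following is a sketch under stated assumptions rather than a definitive implementation: the function names `kanter_a` and `stable_increment`, the seed, and the test value $\alpha = 1.5$ are illustrative choices. It draws $\xi$, $\eta$, $\gamma(0, 2)$, forms the increment according to the last line of the display above, and compares the empirical value of $\mathsf{E}\cos(\theta Z_1)$ with $e^{-|\theta|^{\alpha}}$ from (18).

```python
import math
import random


def kanter_a(z, alpha):
    """The function a(z; alpha) of (25), for 0 < z < pi and 0 < alpha < 1."""
    ratio = math.sin(alpha * z) / math.sin(z)
    return ratio ** (1.0 / (1.0 - alpha)) * math.sin((1.0 - alpha) * z) / math.sin(alpha * z)


def stable_increment(alpha, dt, rng):
    """One draw of Z_t - Z_s for t - s = dt, where E exp(i*theta*Z_t) = exp(-t*|theta|**alpha)."""
    if not 0.0 < alpha < 2.0:
        raise ValueError("alpha must lie in (0, 2)")
    xi = math.pi * (1.0 - rng.random())       # uniform on (0, pi]
    eta = rng.expovariate(1.0)                # exponential with parameter one
    gamma = rng.gauss(0.0, math.sqrt(2.0))    # Gaussian with mean 0 and variance 2
    half = alpha / 2.0                        # index of the subordinator T
    t1 = (kanter_a(xi, half) / eta) ** ((1.0 - half) / half)   # T_1 as in (27)
    return dt ** (1.0 / alpha) * math.sqrt(t1) * gamma


rng = random.Random(7)
alpha, theta, n = 1.5, 1.0, 40000
draws = [stable_increment(alpha, 1.0, rng) for _ in range(n)]
# The law is symmetric, so the characteristic function is real: E cos(theta * Z_1).
empirical_cf = sum(math.cos(theta * z) for z in draws) / n
print(empirical_cf, math.exp(-abs(theta) ** alpha))
```

For $\alpha = 1$ the same function produces standard Cauchy variables. For $\alpha$ very close to zero the power in `t1` can overflow in floating-point arithmetic, so the sketch is meant for moderate values of $\alpha$.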


Remark. For general results on the construction of Lévy processes from other Lévy processes (of a simpler structure) by means of subordinators, see [47], [239], [409], and [483].

The process $Z = (Z_t)_{t \ge 0}$ is used in [327] for the description of the behavior of prices (the Mandelbrot–Taylor model). It is worth noting that if $t$ is real, ‘physical’ time, then $T_t$ can be interpreted as ‘operational’ time (see Chapter IV, § 3d) or as the random ‘number of transitions’ that occurred before time $t$. (This slightly loose interpretation is inspired by similarities with the sums $\sum_{k=1}^{T_n} \xi_k$ of a random number $T_n$ of random variables $\xi_k$, $k \ge 1$.)

We must emphasize that for each $t$ the distribution of $Z_t = B_{T_t}$ is a mixture of Gaussian distributions. In other words, we can say that the distribution of $Z_t$ is conditionally Gaussian. We have already discussed such distributions (see Chapter II, § 1d and § 3a). Below, in § 1d, we shall consider models based on hyperbolic distributions, which are also conditionally Gaussian and belong to the class of infinitely divisible distributions, but are not stable. All this indicates that in our search for an adequate description of the evolution of prices we must, in a certain sense, look towards conditionally Gaussian distributions and processes.


5. In conclusion we consider three cases of stable Lévy processes corresponding to the three cases of known explicit formulas for stable densities (see § 1a.5).


EXAMPLE 1. A standard Brownian motion $X = (X_t)_{t \ge 0}$ in $\mathbb{R}^d$ is a strictly 2-stable Lévy process. The corresponding probability distribution $P_1 = P_1(dx)$ of $X_1$ is as follows:

$$P_1(dx) = (2\pi)^{-d/2} e^{-|x|^2/2}\,dx, \qquad x \in \mathbb{R}^d. \tag{29}$$

The characteristic function of $X_t$ is

$$\varphi_{X_t}(\theta) = \mathsf{E} e^{i(\theta, X_t)} = e^{-\frac{t}{2}|\theta|^2}, \tag{30}$$

and one immediately sees that (cf. (14))

$$\varphi_{X_{at}}(\theta) = e^{-\frac{at}{2}|\theta|^2} = e^{-\frac{t}{2}|a^{1/2}\theta|^2} = \varphi_{a^{1/2}X_t}(\theta). \tag{31}$$

EXAMPLE 2. A standard Cauchy process in $\mathbb{R}^d$ is a strictly 1-stable Lévy process with

$$P_1(dx) = \pi^{-\frac{d+1}{2}}\,\Gamma\Bigl(\frac{d+1}{2}\Bigr) (1 + |x|^2)^{-\frac{d+1}{2}}\,dx, \qquad x \in \mathbb{R}^d. \tag{32}$$

The characteristic function of $X_t$ is

$$\varphi_{X_t}(\theta) = e^{-t|\theta|}, \tag{33}$$

and, clearly (cf. (14)),

$$\varphi_{X_{at}}(\theta) = e^{-at|\theta|} = e^{-t|a\theta|} = \varphi_{aX_t}(\theta). \tag{34}$$


EXAMPLE 3. For a one-sided strictly $\frac{1}{2}$-stable Lévy process in $(0, \infty)$ we have

$$P_1(dx) = (2\pi)^{-1/2} I_{(0,\infty)}(x)\, e^{-1/(2x)} x^{-3/2}\,dx, \qquad x \in \mathbb{R}, \tag{35}$$

and

$$\varphi_{X_t}(\theta) = \exp\bigl\{-t\sqrt{|\theta|}\,(1 - i \operatorname{Sgn} \theta)\bigr\}. \tag{36}$$

We immediately see that

$$\varphi_{X_{at}}(\theta) = \exp\bigl\{-at\sqrt{|\theta|}\,(1 - i \operatorname{Sgn} \theta)\bigr\} = \exp\bigl\{-t\sqrt{|a^2\theta|}\,(1 - i \operatorname{Sgn}(a^2\theta))\bigr\} = \varphi_{a^2X_t}(\theta). \tag{37}$$

These examples virtually exhaust all known cases when the one-dimensional distributions of $X_1$ (and therefore, of $X_t$) can be expressed in terms of elementary functions.


§1d. Hyperbolic Distributions and Processes 


1. In 1977, O. Barndorff-Nielsen [21] introduced a class of distributions that is interesting in many respects, the so-called generalized hyperbolic distributions. His motivation was the desire to find an adequate explanation for some empirical laws in geology; subsequently, these distributions found applications to geomorphology, turbulence theory, ..., and also to financial mathematics.

Generalized hyperbolic distributions are not stable, but they are close to stable ones in that they can be characterized by several parameters of a close meaning (see subsection 2).

We distinguish two distributions in this class that are most frequent in applications:

1) the hyperbolic distribution in the proper sense;
2) the Gaussian\\inverse Gaussian distribution.

It should be mentioned that these distributions are mixtures of Gaussian ones. Hence they are naturally consistent with the idea of the use of conditionally Gaussian distributions. At the same time, the distributions in question are infinitely divisible and form a fairly wide subclass of infinitely divisible distributions. Judging by the behavior of their ‘tails’, they are intermediate between stable (with exponent $\alpha < 2$) and Gaussian distributions (with $\alpha = 2$): their ‘tails’ decrease faster than for stable ($\alpha < 2$), but slower than for Gaussian distributions.


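2. The attribute ‘hyperbolic’ is due to the following observation.
For a normal (Gaussian) density

$$\varphi(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}} \tag{1}$$

the graph of its logarithm $\ln \varphi(x)$ is a parabola, while the graph of the logarithm $\ln h_1(x)$ of the density

$$h_1(x) = C_1(\alpha, \beta, \delta) \exp\bigl\{-\alpha\sqrt{\delta^2 + (x - \mu)^2} + \beta(x - \mu)\bigr\} \tag{2}$$

of a hyperbolic distribution is a hyperbola

$$f(x) = \ln C_1(\alpha, \beta, \delta) - \alpha\sqrt{\delta^2 + (x - \mu)^2} + \beta(x - \mu) \tag{3}$$

with asymptotes

$$a(x) = -\alpha|x - \mu| + \beta(x - \mu). \tag{4}$$

The four parameters $(\alpha, \beta, \mu, \delta)$ in the definition (2) of a hyperbolic distribution are assumed to satisfy the conditions

$$\alpha > 0, \qquad 0 \le |\beta| < \alpha, \qquad \mu \in \mathbb{R}, \qquad \delta \ge 0. \tag{5}$$

The parameters $\alpha$ and $\beta$ ‘govern’ the form of the density graph, $\mu$ is the location parameter, and $\delta$ is the scale parameter. The constant $C_1$ has the representation

$$C_1(\alpha, \beta, \delta) = \frac{\sqrt{\alpha^2 - \beta^2}}{2\alpha\delta K_1\bigl(\delta\sqrt{\alpha^2 - \beta^2}\bigr)}, \tag{6}$$

where $K_1(x)$ is the modified third-kind Bessel function of index 1 (see [23]).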
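That the constant (6) indeed normalizes the density (2) can be checked numerically. The sketch below is an illustration under stated assumptions: $K_1$ is evaluated through the standard integral representation $K_1(x) = \int_0^{\infty} e^{-x\cosh t} \cosh t\,dt$ with a midpoint rule (a substitute for a library Bessel routine), and the parameter values, truncation limits, and function names are arbitrary choices.

```python
import math


def bessel_k1(x, t_max=20.0, steps=20000):
    """K_1(x) via int_0^inf exp(-x*cosh(t)) * cosh(t) dt (midpoint rule, x not too small)."""
    h = t_max / steps
    total = 0.0
    for i in range(steps):
        c = math.cosh((i + 0.5) * h)
        total += math.exp(-x * c) * c
    return total * h


def hyperbolic_constant(alpha, beta, delta):
    """The constant C_1(alpha, beta, delta) of formula (6)."""
    root = math.sqrt(alpha**2 - beta**2)
    return root / (2.0 * alpha * delta * bessel_k1(delta * root))


def hyperbolic_density(x, alpha, beta, mu, delta, c1):
    """The density h_1(x) of formula (2), with the constant c1 precomputed."""
    return c1 * math.exp(-alpha * math.sqrt(delta**2 + (x - mu) ** 2) + beta * (x - mu))


alpha, beta, mu, delta = 2.0, 0.5, 0.0, 1.0
c1 = hyperbolic_constant(alpha, beta, delta)

# Integrate h_1 over [mu - 40, mu + 40]; the tails beyond are negligible here.
half_width, steps = 40.0, 100000
h = 2.0 * half_width / steps
mass = h * sum(
    hyperbolic_density(mu - half_width + (i + 0.5) * h, alpha, beta, mu, delta, c1)
    for i in range(steps)
)
print(c1, mass)
```

The printed total mass should be very close to one.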


One often uses another set of parameters to describe a hyperbolic distribution. Namely, one sets

$$\alpha = \frac{1}{2}(\phi + \gamma) \quad \text{and} \quad \beta = \frac{1}{2}(\phi - \gamma), \tag{7}$$

so that $\phi\gamma = \alpha^2 - \beta^2$.

We shall denote the density $h_1(x) = h_1(x; \alpha, \beta, \mu, \delta)$ expressed in terms of these new parameters by $h_2(x) = h_2(x; \phi, \gamma, \mu, \delta)$. It has the following representation:

$$h_2(x) = C_2(\phi, \gamma, \delta) \exp\Bigl\{-\frac{1}{2}(\phi + \gamma)\sqrt{\delta^2 + (x - \mu)^2} + \frac{1}{2}(\phi - \gamma)(x - \mu)\Bigr\}, \tag{8}$$

where

$$C_2(\phi, \gamma, \delta) = \frac{\omega}{\delta\kappa K_1(\delta\kappa)}$$

with $\kappa = (\phi\gamma)^{1/2}$ and $\omega^{-1} = \phi^{-1} + \gamma^{-1}$. (The last parameters are used in [289].)
It is clear from (8) that if a random variable $X$ has density $h_2(x; \phi, \gamma, \mu, \delta)$, then the variable $Y = (X - a)/b$, where $a \in \mathbb{R}$ and $b > 0$, has the distribution density $h_2(x; b\phi, b\gamma, (\mu - a)/b, \delta/b)$. Thus, the class of hyperbolic distributions is invariant under shifts and scaling.
It is also clear from (8) that $h_2(x) > 0$ for all $x \in \mathbb{R}$, and the ‘tails’ of $h_2(x)$ decrease exponentially at the ‘rate’ $\phi$ as $x \to -\infty$ and at the ‘rate’ $\gamma$ as $x \to \infty$.
As $\delta \to \infty$, $\delta/\kappa \to \sigma^2$, and $\phi - \gamma \to 0$, we have

$$h_2(x; \phi, \gamma, \mu, \delta) \to \varphi(x),$$

where $\varphi(x) = \varphi(x; \mu, \sigma^2)$ is the normal density.


As $\delta \to 0$, we obtain in the limit the Laplace distribution (which is asymmetric for $\phi \ne \gamma$) with density

$$\lambda(x; \phi, \gamma, \mu) = \omega \exp\Bigl\{-\frac{1}{2}(\phi + \gamma)|x - \mu| + \frac{1}{2}(\phi - \gamma)(x - \mu)\Bigr\}.$$

Introducing the parameters

$$\xi = \bigl(1 + \delta\sqrt{\phi\gamma}\bigr)^{-1/2} \quad \Bigl(= \bigl(1 + \delta\sqrt{\alpha^2 - \beta^2}\bigr)^{-1/2}\Bigr)$$

and

$$\chi = \frac{\phi - \gamma}{\phi + \gamma}\,\xi \quad \Bigl(= \frac{\beta}{\alpha}\,\xi\Bigr)$$

we notice that they do not change when one passes from a random variable $X$ with hyperbolic density $h_2(x; \phi, \gamma, \mu, \delta)$ to a random variable $Y = X - a$ with density $h_2(x; \phi, \gamma, \mu - a, \delta)$ indicated above. These parameters $\xi$ and $\chi$, which have the meaning of skewness and kurtosis, are good indicators of the deviation of a distribution from normality (see Chapter IV, § 2b for greater detail).
We note that the range of $(\chi, \xi)$ is the interior of the triangle

$$\nabla = \{(\chi, \xi) : 0 \le |\chi| < \xi < 1\}$$

(see Fig. 26; here $\mathcal{N}$ corresponds to the normal distribution, $\mathcal{E}$ to the exponential, and $\mathcal{L}$ to the Laplace distribution).

[FIGURE 26: the triangle $\nabla$ in the $(\chi, \xi)$-plane]

The boundary point $(0, 0) \notin \nabla$ corresponds to the normal distribution; the points $(-1, 1)$ and $(1, 1)$, also lying outside $\nabla$, to the exponential distribution, and $(0, 1) \notin \nabla$ to the Laplace distribution. Passing to the limit as $\chi \to \pm\xi$ we obtain (see [21]-[23], [25], [26]) the so-called generalized inverse Gaussian distribution.

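It was mentioned in [21], [23], [25], [26], [22] that a hyperbolic distribution is a mixture of Gaussian ones: if a random variable $X$ has density $h_1(x; \alpha, \beta, \mu, \delta)$, then

$$\mathrm{Law}\, X = \mathsf{E}_{\sigma^2}\, \mathcal{N}(\mu + \beta\sigma^2, \sigma^2), \tag{9}$$

where by $\mathsf{E}_{\sigma^2}$ we mean the result of averaging with respect to the parameter $\sigma^2$ that has the inverse Gaussian distribution with density

$$p_{\sigma^2}(x) = \frac{\sqrt{a/b}}{2K_1(\sqrt{ab})} \exp\Bigl\{-\frac{1}{2}\Bigl(ax + \frac{b}{x}\Bigr)\Bigr\}; \tag{10}$$

here $a = \alpha^2 - \beta^2$ and $b = \delta^2$.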


3. We now consider another representative of Barndorff-Nielsen’s class of generalized hyperbolic distributions [21], the so-called Gaussian\\inverse Gaussian distribution (GIG-distribution). (Later on, Barndorff-Nielsen started using also the name ‘normal inverse Gaussian distributions’ [22].)

In the spirit of the symbolic formula (9) we can define the Gaussian\\inverse Gaussian distribution of a random variable $Y$ as follows:

$$\mathrm{Law}\, Y = \mathsf{E}_{\sigma^2}\, \mathcal{N}(\mu + \beta\sigma^2, \sigma^2), \tag{11}$$

where we consider averaging $\mathsf{E}_{\sigma^2}$ of the normal distribution $\mathcal{N}(\mu + \beta\sigma^2, \sigma^2)$ with respect to the inverse Gaussian distribution with density

$$p_{\sigma^2}(x) = \sqrt{\frac{b}{2\pi}}\; e^{\sqrt{ab}}\, x^{-3/2} \exp\Bigl\{-\frac{1}{2}\Bigl(ax + \frac{b}{x}\Bigr)\Bigr\}, \tag{12}$$

where $a = \alpha^2 - \beta^2$, $b = \delta^2$. (The parameters $\alpha$, $\beta$, $\mu$, and $\delta$ must satisfy conditions (5).)

It is interesting that if $W = (W_t)_{t \ge 0}$ is a standard Brownian motion (a Wiener process) and

$$T(t) = \inf\{s > 0 : W_s + \sqrt{a}\,s \ge \sqrt{b}\,t\}$$

is the first time when the process $(W_s + \sqrt{a}\,s)_{s \ge 0}$ reaches the level $\sqrt{b}\,t$, then $T(1)$ has just a distribution with density (12). Hence if $B = (B_t)_{t \ge 0}$ is a standard Brownian motion independent of $W$, then $Y$ has the same distribution as the variable

$$B_{T(1)} + (\mu + \beta T(1)). \tag{13}$$

(Cf. formula (19) in § 1c.)
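The mixture representation (11)–(13) suggests a direct way of sampling $Y$: draw $\sigma^2$ from the density (12) and then a Gaussian variable with mean $\mu + \beta\sigma^2$ and variance $\sigma^2$. The sketch below makes two assumptions that are not in the text. First, (12) is read as the inverse Gaussian law with mean $\sqrt{b/a}$ and shape $b$. Second, that law is sampled by the transformation method of Michael, Schucany and Haas, a technique swapped in for the first-passage construction. The function names and parameter values are illustrative. The sample mean and variance are compared with the well-known closed forms $\mu + \delta\beta/\sqrt{\alpha^2 - \beta^2}$ and $\delta\alpha^2/(\alpha^2 - \beta^2)^{3/2}$.

```python
import math
import random


def inverse_gaussian(mean, shape, rng):
    """Sample the inverse Gaussian law with given mean and shape
    (Michael-Schucany-Haas transformation method)."""
    y = rng.gauss(0.0, 1.0) ** 2
    x = mean + mean**2 * y / (2.0 * shape) - (mean / (2.0 * shape)) * math.sqrt(
        4.0 * mean * shape * y + mean**2 * y**2
    )
    if rng.random() <= mean / (mean + x):
        return x
    return mean**2 / x


def gig_sample(alpha, beta, mu, delta, rng):
    """One draw of Y = mu + beta*sigma2 + sqrt(sigma2) * N(0, 1), sigma2 ~ density (12)."""
    a = alpha**2 - beta**2        # a and b as in (12)
    b = delta**2
    sigma2 = inverse_gaussian(math.sqrt(b / a), b, rng)
    return mu + beta * sigma2 + math.sqrt(sigma2) * rng.gauss(0.0, 1.0)


rng = random.Random(3)
alpha, beta, mu, delta = 2.0, 1.0, 0.5, 1.5
draws = [gig_sample(alpha, beta, mu, delta, rng) for _ in range(40000)]

sample_mean = sum(draws) / len(draws)
sample_var = sum((y - sample_mean) ** 2 for y in draws) / len(draws)
root = math.sqrt(alpha**2 - beta**2)
mean_formula = mu + delta * beta / root
var_formula = delta * alpha**2 / root**3
print(sample_mean, mean_formula)
print(sample_var, var_formula)
```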


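Now let $g(x) = g(x; \alpha, \beta, \mu, \delta)$ be the density of a GIG-distribution.
By (11) and (12),

$$g(x) = C_3(\alpha, \beta, \mu, \delta)\, q\Bigl(\frac{x - \mu}{\delta}\Bigr)^{-1} K_1\Bigl(\delta\alpha\, q\Bigl(\frac{x - \mu}{\delta}\Bigr)\Bigr) e^{\beta x}, \tag{14}$$

where

$$C_3(\alpha, \beta, \mu, \delta) = \frac{\alpha}{\pi} \exp\bigl\{\delta\sqrt{\alpha^2 - \beta^2} - \beta\mu\bigr\}$$

and $q(x) = \sqrt{1 + x^2}$.

Since

$$K_1(x) \sim \sqrt{\frac{\pi}{2x}}\; e^{-x} \quad \text{as } x \to \infty, \tag{15}$$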


son (85) r VEEE) 08 


as |x| — oo, therefore 


ae ~Gin(i + (F54)’), |x| > oo. (17) 


The last relation shows that hı (x) has ‘heavier’ tails than g(x). 


4. A hyperbolic distribution (with density $h_1(x)$) has a simpler structure than a Gaussian\\inverse Gaussian distribution (GIG-distribution) with density $g(x)$. However, one crucial feature of the latter distribution makes it advantageous in certain respects.

Let $Y$ be a random variable with density $g(x) = g(x; \alpha, \beta, \mu, \delta)$. Then its moment generating function is

$$\mathsf{E} e^{\lambda Y} = \exp\Bigl\{\delta\Bigl[\sqrt{\alpha^2 - \beta^2} - \sqrt{\alpha^2 - (\beta + \lambda)^2}\Bigr] + \mu\lambda\Bigr\}. \tag{18}$$

Hence if $Y_1, \dots, Y_m$ are independent GIG-distributed variables with the same values of $\alpha$ and $\beta$, but with (generally speaking) distinct $\mu_i$ and $\delta_i$, then the sum $Y = Y_1 + \cdots + Y_m$ is also a GIG-distributed variable with the same parameters $\alpha$ and $\beta$, but with $\mu = \mu_1 + \cdots + \mu_m$ and $\delta = \delta_1 + \cdots + \delta_m$.

In other words, the GIG-distributions are closed (in the above sense) with respect to convolutions.

On the other hand, if $X$ has a hyperbolic distribution, then setting for simplicity $\beta = \mu = 0$ we obtain

$$\mathsf{E} e^{\lambda X} = \frac{\alpha}{K_1(\alpha\delta)} \cdot \frac{K_1\bigl(\delta\sqrt{\alpha^2 - \lambda^2}\bigr)}{\sqrt{\alpha^2 - \lambda^2}}. \tag{19}$$

Hence the class of hyperbolic distributions is not closed in a way similar to GIG-distributions.

It is important to point out that both the GIG-distribution and the hyperbolic distribution are infinitely divisible. For GIG-distributions this is clearly visible from (18), while for hyperbolic distributions this is proved in [21]-[23], [25], [26]. By (18) we obtain also the following simple formulae for the expectation $\mathsf{E}Y$ and the variance $\mathsf{D}Y$:

$$\mathsf{E}Y = \mu + \frac{\delta\,\frac{\beta}{\alpha}}{\bigl[1 - \bigl(\frac{\beta}{\alpha}\bigr)^2\bigr]^{1/2}}, \qquad \mathsf{D}Y = \frac{\delta}{\alpha\bigl[1 - \bigl(\frac{\beta}{\alpha}\bigr)^2\bigr]^{3/2}}.$$


Remark. As regards the applications of these distributions to the analysis of financial indexes, see E. Eberlein and U. Keller [127], where one can find impressive results of the statistical processing of several financial indexes (see also Chapter II, § 2b).


5. Since hyperbolic distributions are infinitely divisible, we can define a Lévy process (i.e., a process with independent homogeneous increments) with hyperbolic distribution of increments.

We restrict ourselves to the case of symmetric centered densities $h_1(x) = h_1(x; \alpha, \beta, \mu, \delta)$ with parameters $\beta = \mu = 0$. In this case $h_1(x)$ has the following representation:

$$h_1(x) = \frac{1}{2\delta K_1(\alpha\delta)} \exp\Bigl\{-\alpha\delta\sqrt{1 + \Bigl(\frac{x}{\delta}\Bigr)^2}\Bigr\}. \tag{20}$$

Let $Z = (Z_t)_{t \ge 0}$ be the Lévy process such that $Z_1$ has the distribution with density (20).

It is worth noting that since $\mathsf{E}|Z_t| < \infty$, $Z_0 = 0$, and $Z = (Z_t)$ has independent increments, this process is a martingale (with respect to the natural flow of $\sigma$-algebras $(\mathcal{F}_t)$, $\mathcal{F}_t = \sigma(Z_s, s \le t)$), i.e.,

$$\mathsf{E}(Z_t \mid \mathcal{F}_s) = Z_s, \qquad t \ge s. \tag{21}$$

(Actually, $\mathsf{E}|Z_t|^p < \infty$ for all $p \ge 1$.)

Let $\varphi_t(\theta) = \mathsf{E}\exp(i\theta Z_t)$ be the characteristic function of the hyperbolic Lévy process $Z = (Z_t)_{t \ge 0}$. Then (cf. formulas (4) and (5) in § 1b)

$$\varphi_t(\theta) = (\varphi_1(\theta))^t. \tag{22}$$

By (9) and (10) (with $X = Z_1$) we obtain

$$\mathrm{Law}\, Z_1 = \mathsf{E}_{\sigma^2}\, \mathcal{N}(0, \sigma^2) \tag{23}$$

(because $\beta = \mu = 0$), where

$$p_{\sigma^2}(x) = \frac{\alpha/\delta}{2K_1(\alpha\delta)} \exp\Bigl\{-\frac{1}{2}\Bigl(\alpha^2 x + \frac{\delta^2}{x}\Bigr)\Bigr\}. \tag{24}$$

Based on this representation, the following ‘Lévy–Khintchine’ formula for $\varphi_t(\theta)$ was put forward in [127] (cf. formulas (22) and (29) in § 1a):

$$\varphi_t(\theta) = \exp\Bigl\{t \int_{-\infty}^{\infty} (e^{i\theta x} - 1 - i\theta x)\,\nu(dx)\Bigr\}, \tag{25}$$

where the Lévy measure $\nu$ has the density $p_\nu(x)$ (with respect to Lebesgue measure) with the following (fairly complicated) representation:

$$p_\nu(x) = \frac{1}{\pi^2 |x|} \int_0^{\infty} \frac{\exp\bigl(-|x|\sqrt{2y + \alpha^2}\bigr)}{y\bigl(J_1^2(\delta\sqrt{2y}) + Y_1^2(\delta\sqrt{2y})\bigr)}\,dy + \frac{\exp(-|x|)}{|x|}. \tag{26}$$

Here $J_1$ and $Y_1$ are the Bessel functions of the first and second kinds, respectively.
In view of the asymptotic properties of $J_1$ and $Y_1$ (formulas 9.1.7,9 and 9.2.1,2 in [1]), one can show (see [127]) that the denominator of the integrand in (26) is asymptotically a constant as $y \to 0$ and is of order $y^{1/2}$ as $y \to \infty$. Hence

$$p_\nu(x) \sim \frac{1}{x^2} \quad \text{as } x \to 0, \tag{27}$$

which shows that the process $Z = (Z_t)_{t \ge 0}$ makes infinitely many small jumps on each arbitrarily small time interval.
Indeed, let

$$\mu^Z(\omega; \Delta, B) = \sum_{s \in \Delta} I\bigl(Z_s(\omega) - Z_{s-}(\omega) \in B\bigr), \qquad B \in \mathcal{B}(\mathbb{R} \setminus \{0\}),$$

be the jump measure of $Z$, i.e., the number of $s \in \Delta$ such that the difference $Z_s(\omega) - Z_{s-}(\omega)$ lies in the set $B$. Then (see, e.g., [250; Chapter II, 1.8])

$$\mathsf{E}\mu^Z(\omega; \Delta, B) = |\Delta|\,\nu(B),$$


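therefore

$$\int_0^{\varepsilon} \nu(dx) = \infty \quad \text{and} \quad \int_{-\varepsilon}^{0} \nu(dx) = \infty \quad \text{for each } \varepsilon > 0.$$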


2. Models with Self-Similarity. Fractality 


It has long been observed in the statistical analysis of financial time series that many of these series have the property of (statistical) self-similarity; namely, the structure of their parts ‘is the same’ as the structure of the whole object. For instance, if the $S_n$, $n \ge 0$, are the daily values of the S&P500 Index, then the empirical densities $\hat{f}_1(x)$ and $\hat{f}_k(x)$, $k > 1$, of the distributions of the variables

$$h_n = \ln\Bigl(\frac{S_n}{S_{n-1}}\Bigr) \quad \text{and} \quad h_n^{(k)} = \ln\Bigl(\frac{S_{kn}}{S_{k(n-1)}}\Bigr), \qquad n \ge 1,$$

calculated for large bodies of data, satisfy the relation

$$\hat{f}_k(x) = k^{-H} \hat{f}_1(k^{-H} x),$$

where $H$ is some constant (which, by the way, is significantly larger than 1/2, by contrast to what one would expect in accordance with the central limit theorem).
Of course, such properties call for explanations. As will be clear from what follows, such an explanation can be provided in the framework of the general concept of (statistical) self-similarity, which not only paved the way to fractional Brownian motion, fractional Gaussian noise, and other important notions, but was also crucial for the development of fractal geometry (B. Mandelbrot). This concept of self-similarity is intimately connected with nonprobabilistic concepts and theories, such as chaos or nonlinear dynamical systems, which (with an eye to their possible applications in financial mathematics) were discussed in Chapter II, § 4.
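A scaling relation of the form $f_k(x) = k^{-H} f_1(k^{-H} x)$ is easy to verify in the one case where everything is explicit: independent Gaussian increments, for which $H = 1/2$. The sketch below is only a consistency check of that benchmark case. The Gaussian model and the helper names are assumptions made for the illustration, and the form of the relation itself is read off from the self-similarity property $\mathrm{Law}(X_{at}) = \mathrm{Law}(a^H X_t)$.

```python
import math


def gaussian_density(x, variance):
    """Density of the normal law with mean 0 and the given variance."""
    return math.exp(-x * x / (2.0 * variance)) / math.sqrt(2.0 * math.pi * variance)


def rescaled(f1, k, hurst):
    """Return the function x -> k**(-H) * f1(k**(-H) * x)."""
    scale = k ** (-hurst)
    return lambda x: scale * f1(scale * x)


def f1(x):
    """Density of a one-step increment (unit variance)."""
    return gaussian_density(x, 1.0)


k = 5


def fk(x):
    """Density of a k-step increment: a sum of k independent N(0, 1) variables."""
    return gaussian_density(x, float(k))


fk_from_scaling = rescaled(f1, k, 0.5)
for x in (-2.0, 0.0, 0.7, 3.0):
    print(x, fk(x), fk_from_scaling(x))
```

With any exponent other than $1/2$ the two columns disagree, which is what makes an empirical $H$ significantly above $1/2$ informative.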


§ 2a. Hurst’s Statistical Phenomenon of Self-Similarity 


1. In 1951, a British climatologist H. E. Hurst, who spent more than 60 years in 
Egypt as a participant of Nile hydrology projects, published a paper [236] devoted 




to his discovery of the following surprising phenomenon in the fluctuations of yearly 
run-offs of Nile and several other rivers. 


Let $x_1, x_2, \dots, x_n$ be the values of $n$ successive yearly water run-offs (of the Nile, say, in some part of it). Then the value of $\frac{1}{n} X_n$, where $X_n = \sum_{k=1}^{n} x_k$, is a ‘good’ estimate for the expectation of the $x_k$.

The deviation of the cumulative value $X_k$ corresponding to $k$ successive years from the (empirical) mean as calculated using the data for $n$ years is

$$X_k - \frac{k}{n} X_n,$$

and

$$\min_{k \le n}\Bigl(X_k - \frac{k}{n} X_n\Bigr) \quad \text{and} \quad \max_{k \le n}\Bigl(X_k - \frac{k}{n} X_n\Bigr)$$

are the smallest and the biggest deviations. Let

$$R_n = \max_{k \le n}\Bigl(X_k - \frac{k}{n} X_n\Bigr) - \min_{k \le n}\Bigl(X_k - \frac{k}{n} X_n\Bigr)$$

be the ‘range’ characterizing the amplitude of the deviation of the cumulative values $X_k$ from their mean value $\frac{k}{n} X_n$ over $n$ successive years.
However, Hurst did not operate with the values of the $R_n$ themselves; he considered instead the normalized values $Q_n = R_n / S_n$, where

$$S_n = \sqrt{\frac{1}{n} \sum_{k=1}^{n} x_k^2 - \Bigl(\frac{1}{n} \sum_{k=1}^{n} x_k\Bigr)^2}$$

is the empirical standard deviation introduced in order to make the statistics invariant under the change

$$x_k \to c(x_k + m), \qquad k \ge 1.$$

This is a desirable property because even the expectation and the variance of the $x_k$ are usually unknown.

Based on a large volume of factual data, the records of observations of Nile
flows in 622-1469 (i.e., over a period of 847 years), H. Hurst discovered the following
behavior of the statistics $R_n/S_n$ for large $n$:

$$\frac{R_n}{S_n} \sim c\,n^{H}, \tag{1}$$

where $c$ is a certain constant, the equivalence ‘$\sim$’ is interpreted in some suitable
sense, and the parameter $H$, which is now called the Hurst parameter or the Hurst
exponent, is approximately equal to 0.7. (H. Hurst obtained close values of this
parameter also for other rivers.) This was an unexpected result; he had anticipated
that $H = 0.5$ for the reason that we explain now following a later paper by
W. Feller [157].

Let $x_1, x_2, \ldots$ be a sequence of independent identically distributed random variables
with $\mathsf{E}x_n = 0$ and $\mathsf{E}x_n^2 = 1$. (This is what Hurst had expected.) Then, as
shown by Feller,

$$\mathsf{E}R_n \sim \sqrt{\frac{\pi}{2}}\, n^{1/2} \quad\text{and}\quad \mathsf{D}R_n \sim \Bigl(\frac{\pi^2}{6} - \frac{\pi}{2}\Bigr) n$$

for large $n$. Since, moreover, $S_n \to 1$ (with probability one) in this case, the values
of $Q_n$ must increase (on the average, at any rate) as $n^{1/2}$ for $n$ large.
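
The statistics $Q_n = R_n/S_n$ described above can be computed directly from a list of observations. The following is only a minimal sketch in Python: the function names `rescaled_range` and `mean_rescaled_range` are illustrative and not taken from the text, and the simulation with independent standard normal variables is an assumption made purely to illustrate Feller's $n^{1/2}$ growth.

```python
import math
import random


def rescaled_range(x):
    """Return Hurst's normalized range Q_n = R_n / S_n for x_1, ..., x_n."""
    n = len(x)
    total = sum(x)  # X_n
    cumulative = 0.0  # running value of X_k
    deviations = []
    for k, value in enumerate(x, start=1):
        cumulative += value
        deviations.append(cumulative - k * total / n)  # X_k - (k/n) X_n
    r_n = max(deviations) - min(deviations)
    mean = total / n
    s_n = math.sqrt(sum(v * v for v in x) / n - mean * mean)
    return r_n / s_n


def mean_rescaled_range(n, n_paths, seed=0):
    """Average Q_n over simulated i.i.d. N(0, 1) samples (illustration only)."""
    rng = random.Random(seed)
    values = [
        rescaled_range([rng.gauss(0.0, 1.0) for _ in range(n)])
        for _ in range(n_paths)
    ]
    return sum(values) / n_paths
```

For instance, `rescaled_range([1.0, 2.0, 3.0])` and `rescaled_range([33.0, 36.0, 39.0])` agree, which reflects the invariance under $x_k \to c(x_k + m)$. For independent observations the average of $Q_n$ should be comparable to $\sqrt{\pi n/2}$, although for moderate $n$ it is typically somewhat smaller than this asymptotic value.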

In a study of the statistical properties of the sequence $(x_n)_{n \ge 1}$ it makes sense to
ask about the structure of the empirical distribution function $\mathrm{Law}(x_1 + \cdots + x_n)$
calculated from a (large) number of samples, say, $(x_1, \ldots, x_n)$, $(x_{n+1}, \ldots, x_{2n})$, ...
In the case when the $x_i$ are the deviations of the water level from some ‘mean’
value, one finds out (e.g., for the Nile again) that

$$\mathrm{Law}(x_1 + \cdots + x_n) \approx \mathrm{Law}(n^{H} x_1), \tag{2}$$

where $H > 1/2$.


2. How, and owing to which probabilistic and statistical properties of the sequence
$(x_n)$ in (2), can the parameter $H$ be distinct from 0.5?

Looking at formula (4) in § 1a we see one of the possible explanations for the
relation $H \approx 0.7$: the $x_i$ can be independent stable random variables with stability
exponent $\alpha = 1/H \approx 1.48$.

There exists another possible explanation: relation (2) with $H \ne 1/2$ can occur
even in the case of normally distributed, but dependent variables $x_1, x_2, \ldots$! In that
case the stationary sequence $(x_n)$ is necessarily a sequence with strong aftereffect
(see § 2c below).


3. Properties (1) and (2), which can be regarded as a peculiar form of self-similarity,
can also be observed for many financial indexes (with the $h_n$ in place of the $x_n$).
Unsurprisingly, the above observation that the $x_n$ can be ‘independent and stable’ or
‘dependent and normal’ has found numerous applications in financial mathematics,
and in particular, in the analysis of the ‘fractal’ structure of ‘volatility’.

Hurst’s results and the above observations served as the starting point for
B. Mandelbrot, who suggested that, in the Hurst model (considered by Mandelbrot
himself) and in many other probabilistic models (e.g., in financial mathematics),
one could use strictly stable processes (§ 1c) and fractional Brownian motions (§ 2c),
which have the property of self-similarity.

It should be pointed out that a variety of real-life systems with nonlinear dynamics
(occurring in physics, geophysics, biology, economics, ...) are characterized by
self-similarity of kinds (1) and (2). It is this property of self-similarity that occupies
the central place in fractal geometry, the founder of which, B. Mandelbrot,
has chosen the title “The Fractal Geometry of Nature” for his book [320], so as to
emphasize the universal character of self-similarity.

We present the necessary definitions of statistical self-similarity and fractional
Brownian motion in § 2c. The aim of the next section, which has no direct relation
to financial mathematics and was inspired by [104], [379], [385], [386], [428], [456],
and books by some other authors, is to give one a general notion of self-similarity.


§2b. A Digression on Fractal Geometry 


1. It is well known that the emergence of the Euclidean geometry in Ancient Greece 
was the result of an attempt to reduce the variety of natural forms to several 
‘simple’, ‘pure’, ‘symmetric’ objects. That gave rise to points, lines, planes, and 
most simple three-dimensional objects (spheres, cones, cylinders, ...). 

However, as B. Mandelbrot has observed (1984), “clouds are not spheres, mountains
are not cones, coastlines are not circles, and bark is not smooth, nor does
lightning travel in a straight line ...” Mandelbrot has developed the so-called fractal
geometry with the precise aim to describe objects, forms, and phenomena that are
far from ‘simple’ and ‘symmetric’. On the contrary, they can have a rather complex
structure, but exhibit at the same time some properties of self-similarity,
self-reproducibility.

We have no intention of giving a formal definition of fractal geometry or of a
fractal, its central concept. Instead, we wish to call the reader’s attention to the
importance of the idea of fractality in general, and in financial mathematics in
particular, and give for that reason only an illustrative description of this subject.

The following ‘working definition’ is very common: “A fractal is an object whose 
portions have the same structure as the whole.” (The word ‘fractal’ introduced 
by Mandelbrot presumably in 1975 [315] is a derivative of the Latin verb fractio 
meaning ‘fracture, break up’ [296]). 

A classical example of a three-dimensional object of fractal structure is a tree. 
The branches with their twigs (‘portions’) are similar to the ‘whole’, the main trunk 
with branches. 

Another graphic example is the Sierpinski gasket (see Fig. 27) that can be
obtained from a solid (‘black’) triangle by removing the interior of the central
triangle, and by removing subsequently the interiors of the central triangles from
each of the resulting ‘black’ triangles.

The resulting ‘black’ sets converge to a set called the Sierpinski gasket. This
limit set is an example of so-called attractors.


W. Sierpiński (1882-1969), a Polish mathematician, invented the ‘Sierpiński gasket’
and the ‘Sierpiński carpet’ as long ago as 1916.

FIGURE 27. Consecutive steps of the construction of the Sierpinski gasket


Obviously, there are many ‘cavities’ in the Sierpinski gasket. Hence there arises 
a natural question on the ‘dimension’ of this object. 

Strictly speaking, it is not two-dimensional due to these cavities. At the same
time, it is certainly not one-dimensional. Presumably, one can assign some fractal
(i.e., fractional) dimension to the Sierpinski gasket. (Indeed, taking a suitable
definition one can see that this ‘dimension’ is 1.58 ...; see, e.g., [104], [428], or [456].)

The Sierpinski gasket is an example of a symmetric fractal object. In nature,
of course, it is ‘asymmetric fractals’ that dominate. The reason lies in the local
randomness of their development. This, however, does not rule out determinism
on the global level; in support of this we refer to the construction of the Sierpinski
gasket by means of the following stochastic procedure [386].

We consider the equilateral triangle with vertices A = A(1,2), B = B(3, 4), and 
C = C(5,6) in the plane (our use of the integers 1,2,...,6 will be clear from the 
construction that follows). 

In this triangle, we choose an arbitrary point $a$ and then roll a ‘fair’ die with
faces marked by the integers 1, 2, ..., 6. If the number thrown is, e.g., ‘5’ or ‘6’, then
we join $a$ with the vertex C = C(5,6) and let $b$ be the middle of the joining segment.
We roll the die again and, depending on the result, obtain another point in the same
way (now starting from $b$), and so on.

Remarkably, the limit set of the points so obtained is (‘almost always’) the 
Sierpinski gasket (‘black’ points in Fig. 27). 
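
The stochastic procedure just described is easy to simulate. The sketch below is an illustration only: the coordinates of the vertices, the starting point, and the names `chaos_game` and `in_central_hole` are assumptions introduced here, not part of the construction in [386]. The helper `in_central_hole` tests whether a point falls into the open central triangle removed at the first step of the geometric construction.

```python
import math
import random

SQRT3 = math.sqrt(3.0)
# Vertices A(1,2), B(3,4), C(5,6) of an equilateral triangle with side 1
# (the coordinates are an arbitrary choice made for this illustration).
VERTICES = [(0.0, 0.0), (1.0, 0.0), (0.5, SQRT3 / 2)]


def chaos_game(n_points, start=(0.3, 0.2), seed=0):
    """Generate points by repeatedly moving halfway towards a random vertex."""
    rng = random.Random(seed)
    x, y = start
    points = []
    for _ in range(n_points):
        face = rng.randint(1, 6)            # roll a fair die
        vx, vy = VERTICES[(face - 1) // 2]  # faces 1,2 -> A; 3,4 -> B; 5,6 -> C
        x, y = (x + vx) / 2, (y + vy) / 2   # middle of the joining segment
        points.append((x, y))
    return points


def in_central_hole(x, y, margin=1e-9):
    """True if (x, y) lies strictly inside the central triangle removed first."""
    return (
        y < SQRT3 / 4 - margin
        and y > SQRT3 * (0.5 - x) + margin
        and y > SQRT3 * (x - 0.5) + margin
    )
```

A scatter plot of `chaos_game(50000)` (with any plotting tool) should reproduce the picture in Fig. 27; already after the first roll no generated point falls into the central cavity.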

Another classical example, well known to mathematicians, of a set with fractal
structure is the Cantor set introduced by G. Cantor (1845-1918) in 1883 as an
example of a set of a special structure (this is a nowhere dense perfect set, i.e., a
closed set without isolated points, which has the cardinal number of the continuum).
We recall that this is the subset of the closed interval $[0, 1]$ consisting of the numbers
representable as $\sum_{i=1}^{\infty} \frac{\varepsilon_i}{3^i}$ with $\varepsilon_i = 0$ or $2$. Geometrically, the Cantor set can be
obtained from $[0, 1]$ by deleting first the central (open) interval $(\frac13, \frac23)$, then the
central subintervals $(\frac19, \frac29)$ and $(\frac79, \frac89)$ of the resulting intervals $[0, \frac13]$ and $[\frac23, 1]$,
and so on, ad infinitum. The total length of the deleted intervals is 1; nevertheless,
the remaining ‘sparse’ set has the cardinal number of the continuum. The self-similarity
of the Cantor set (i.e., the fact that ‘its portions have the same structure
as the whole set’) is clear from our geometric construction: this set is the union of
subsets each looking like a reduced copy of the whole object.
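
The geometric construction can be followed step by step with exact rational arithmetic. The sketch below is illustrative only (the names `cantor_intervals` and `removed_length` are introduced here): it lists the closed intervals remaining after a given number of deletion rounds and the total length deleted so far, which equals $1 - (2/3)^n$ after $n$ rounds and hence tends to 1.

```python
from fractions import Fraction


def cantor_intervals(steps):
    """Closed intervals left after `steps` rounds of deleting open middle thirds."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(steps):
        next_intervals = []
        for left, right in intervals:
            third = (right - left) / 3
            next_intervals.append((left, left + third))    # keep the left third
            next_intervals.append((right - third, right))  # keep the right third
        intervals = next_intervals
    return intervals


def removed_length(steps):
    """Total length of the open intervals deleted during the first `steps` rounds."""
    remaining = sum(right - left for left, right in cantor_intervals(steps))
    return 1 - remaining
```

Self-similarity is visible here as well: multiplying by 3 the intervals of step $n + 1$ that lie in $[0, \frac13]$ gives exactly the intervals of step $n$.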

Among other well-known objects with the property of self-similarity we mention the
‘Pascal triangle’ (after B. Pascal), the ‘Koch snowflakes’ (after H. von Koch), the
Peano curve (G. Peano), and the Julia sets (G. Julia); see, e.g., [379].


2. In our above discussion of ‘dimension’ we gave no precise definition (F. Hausdorff,
who invented the Hausdorff dimension, pointed out that the problem of an
adequate definition of ‘dimension’ is a very difficult one). Referring the reader
to the literature devoted to this subject (e.g., [104], [428], [456]), we note only that
the notion of a ‘fractal dimension’ of, say, a plane curve is fairly transparent: it
must show how the curve sweeps the plane. If the curve in question is a realization
$X = (X_t)_{t \ge 0}$ of some process, then its fractal dimension increases with the
proportion of ‘high-frequency’ components in $(X_t)_{t \ge 0}$.


3. Let now $X = (X_t)_{t \ge 0}$ be a stochastic (random) process. In this case, it is
reasonable to define the ‘fractal dimension’ for the totality of realizations, rather
than for separate realizations. This brings us to the concept of statistical fractal
dimension, which we introduce in the next section.


§ 2c. Statistical Self-Similarity. Fractional Brownian Motion 


1. DEFINITION 1. We say that a random process $X = (X_t)_{t \ge 0}$ with state space $\mathbb{R}^d$
is self-similar or satisfies the property of (statistical) self-similarity if for each $a > 0$
there exists $b > 0$ such that

$$\mathrm{Law}(X_{at},\ t \ge 0) = \mathrm{Law}(bX_t,\ t \ge 0). \tag{1}$$

In other words, changes of the time scale ($t \to at$) produce the same results as
changes of the phase scale ($x \to bx$).

We saw in § 1c that for (nonzero) strictly stable processes there exists a constant
$H$ such that $b = a^{H}$. In addition, for strictly $\alpha$-stable processes we have

$$H = \frac{1}{\alpha}. \tag{2}$$

In the case of (general) stable processes, in place of (1), we have the property

$$\mathrm{Law}(X_{at},\ t \ge 0) = \mathrm{Law}(a^{H} X_t + tD_a,\ t \ge 0) \tag{3}$$

(see formula (2) in § 1c), which means that, for these processes, a change of the time
scale produces the same results as a change of the phase scale and a subsequent
‘translation’ defined by the vector $tD_a$, $t \ge 0$. Moreover, $H = 1/\alpha$ for $\alpha$-stable
processes.


2. It follows from the above that it would be reasonable to introduce the following
definition.


DEFINITION 2. If $b = a^{H}$ in Definition 1 for each $a > 0$, then we call $X = (X_t)_{t \ge 0}$
a self-similar process with Hurst exponent $H$ or we say that this process has the
property of statistical self-similarity with Hurst exponent $H$. The quantity $D = \frac{1}{H}$
is called the statistical fractal dimension of $X$.


A classical example of a self-similar process is a Brownian motion $X = (X_t)_{t \ge 0}$.
We recall that for this (Gaussian) process we have $\mathsf{E}X_t = 0$ and $\mathsf{E}X_sX_t = \min(s, t)$.
Hence

$$\mathsf{E}X_{as}X_{at} = \min(as, at) = a \min(s, t) = \mathsf{E}(a^{1/2}X_s)(a^{1/2}X_t),$$

so that the two-dimensional distributions $\mathrm{Law}(X_s, X_t)$ have the property

$$\mathrm{Law}(X_{as}, X_{at}) = \mathrm{Law}(a^{1/2} X_s, a^{1/2} X_t).$$

By the Gaussian property it follows that a Brownian motion has the property of
statistical self-similarity with Hurst exponent $H = 1/2$.
Another example is a strictly $\alpha$-stable Lévy motion $X = (X_t)_{t \ge 0}$, which satisfies
the relation

$$X_t - X_s \sim S_\alpha\bigl((t - s)^{1/\alpha}, 0, 0\bigr), \qquad \alpha \in (0, 2].$$

For this process with homogeneous independent increments we have

$$X_{at} - X_{as} \stackrel{d}{=} a^{1/\alpha} (X_t - X_s),$$

so that the corresponding Hurst exponent $H$ is $1/\alpha$ and $D = \alpha$. For $\alpha = 2$ we
obtain a Brownian motion.
We emphasize that the processes in both examples have independent increments.
The next example relates to the case of processes with dependent increments.


3. Fractional Brownian motion. We consider the function

$$A(s, t) = |s|^{2H} + |t|^{2H} - |t - s|^{2H}, \qquad s, t \in \mathbb{R}. \tag{4}$$

For $0 < H \le 1$ this function is nonnegative definite (see, e.g., [439; Chapter II, § 9]),
therefore there exists a Gaussian process on some probability space (e.g., on the
space of real functions $\omega = (\omega_t)$, $t \in \mathbb{R}$) that has zero mean and the autocovariance
function

$$\mathrm{Cov}(X_s, X_t) = \frac{1}{2} A(s, t),$$

i.e., a process such that

$$\mathsf{E}X_sX_t = \frac{1}{2}\bigl\{|s|^{2H} + |t|^{2H} - |t - s|^{2H}\bigr\}. \tag{5}$$

Hence

$$\mathsf{E}X_{as}X_{at} = a^{2H}\,\mathsf{E}X_sX_t = \mathsf{E}(a^{H}X_s)(a^{H}X_t),$$

so that

$$\mathrm{Law}(X_{as}, X_{at}) = \mathrm{Law}(a^{H} X_s, a^{H} X_t).$$

As in the case of a Brownian motion, whose distribution is completely determined
by the two-dimensional distributions, we conclude that $X$ is a self-similar process
with Hurst exponent $H$.
By (5),

$$\mathsf{E}|X_t - X_s|^2 = |t - s|^{2H}. \tag{6}$$


We recall that, in accordance with the Kolmogorov test ([470]), a random process
$X = (X_t)_{t \ge 0}$ has a continuous modification if there exist constants $\alpha > 0$, $\beta > 0$,
and $c > 0$ such that

$$\mathsf{E}|X_t - X_s|^{\alpha} \le c\,|t - s|^{1+\beta} \tag{7}$$

for all $s, t \ge 0$. Hence if $H > 1/2$, then it immediately follows from (6) (regarded
as (7) with $\alpha = 2$ and $\beta = 2H - 1$) that the process $X = (X_t)_{t \ge 0}$ under consideration
has a continuous modification. Further, if $0 < H \le 1/2$, then by the Gaussian
property, for each $0 < k < H$ we have

$$\mathsf{E}|X_t - X_s|^{1/k} = c\,|t - s|^{H/k}$$

with some constant $c > 0$. Hence we can apply the Kolmogorov test again (with
$\alpha = 1/k$ and $\beta = H/k - 1$).

Thus, our Gaussian process $X = (X_t)_{t \ge 0}$ has a continuous modification for all
$H$, $0 < H \le 1$.


DEFINITION 3. We call a continuous Gaussian process $X = (X_t)_{t \ge 0}$ with zero
mean and the covariance function (5) a (standard) fractional Brownian motion
with Hurst self-similarity exponent $0 < H \le 1$. (We shall often denote such a
process by $B_H = (B_H(t))_{t \ge 0}$ in what follows.)


By this definition, a (standard) fractional Brownian motion $X = (X_t)_{t \ge 0}$ has
the following properties (which could also be taken as a basis of its definition):


1) $X_0 = 0$ and $\mathsf{E}X_t = 0$ for all $t \ge 0$;
2) $X$ has homogeneous increments, i.e.,
   $\mathrm{Law}(X_{t+s} - X_s) = \mathrm{Law}(X_t)$, $s, t \ge 0$;
3) $X$ is a Gaussian process and
   $\mathsf{E}X_t^2 = |t|^{2H}$, $t \ge 0$,
   where $0 < H \le 1$;
4) $X$ has continuous trajectories.


These properties show again that a fractional Brownian motion has the self- 
similarity property. 
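
The covariance function (5) is all one needs in order to simulate a fractional Brownian motion on a finite grid. The following sketch is given under stated assumptions: it uses a plain Cholesky factorization of the covariance matrix, which is a standard but not the only (nor the fastest) simulation method and is not a method taken from the text; the names `fbm_cov`, `cholesky`, and `sample_fbm` are illustrative, and the grid points are assumed to be distinct and positive with $0 < H < 1$.

```python
import math
import random


def fbm_cov(s, t, hurst):
    """Covariance (5): (|s|^2H + |t|^2H - |t - s|^2H) / 2."""
    h2 = 2 * hurst
    return 0.5 * (abs(s) ** h2 + abs(t) ** h2 - abs(t - s) ** h2)


def cholesky(matrix):
    """Lower-triangular L with L L^T = matrix (matrix must be positive definite)."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            acc = matrix[i][j] - sum(lower[i][k] * lower[j][k] for k in range(j))
            lower[i][j] = math.sqrt(acc) if i == j else acc / lower[j][j]
    return lower


def sample_fbm(times, hurst, rng):
    """One Gaussian sample (X_t) at distinct positive `times` with covariance (5)."""
    cov = [[fbm_cov(s, t, hurst) for t in times] for s in times]
    lower = cholesky(cov)
    z = [rng.gauss(0.0, 1.0) for _ in times]
    return [sum(lower[i][k] * z[k] for k in range(i + 1)) for i in range(len(times))]
```

For example, `sample_fbm([k / 50 for k in range(1, 51)], 0.7, random.Random(0))` returns the values of one simulated trajectory on the grid $1/50, 2/50, \ldots, 1$. The function `fbm_cov` also makes it easy to check numerically that the covariance scales as $a^{2H}$ under $t \to at$ and reduces to $\min(s, t)$ for $H = 1/2$.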

It is worth pointing out that a converse result is also true in a certain sense
([418; pp. 318-319]): if a nondegenerate process $X = (X_t)_{t \ge 0}$, $X_0 = 0$, has finite
variance, homogeneous increments, and is a self-similar process with Hurst exponent
$H$, then $0 < H \le 1$ and the autocovariance function of this process satisfies
the equality $\mathrm{Cov}(X_s, X_t) = \frac{1}{2}\,\mathsf{E}X_1^2\, A(s, t)$, where $A(s, t)$ is defined by (4). Moreover, if
$0 < H < 1$, then the expectation $\mathsf{E}X_t$ is zero, and if $H = 1$, then $X_t = tX_1$ (P-a.s.).
We note also that, besides Gaussian processes, there also exist some non-Gaussian
processes with these properties (see [418; p. 320]).


4. If H = 1/2, then a (standard) fractional Brownian motion is precisely a (stan- 
dard) Brownian motion (a Wiener process). 

The processes $B_H$ so introduced were first considered by A. N. Kolmogorov
in [278] (1940), where they were called Wiener helices. The name ‘fractional Brownian
motion’ was introduced in 1968 by B. Mandelbrot and J. van Ness [328]. By
contrast to Kolmogorov, who had constructed the process $B_H$ starting from the
covariance function (4), Mandelbrot and van Ness used an ‘explicit’ representation
by means of stochastic integrals with respect to a (certain) Wiener process
$W = (W_t)_{t \in \mathbb{R}}$ with $W_0 = 0$: for $0 < H < 1$ they set

$$B_H(t) = c_H \Bigl\{ \int_{-\infty}^{0} \bigl[(t - s)^{H - 1/2} - (-s)^{H - 1/2}\bigr]\, dW_s + \int_{0}^{t} (t - s)^{H - 1/2}\, dW_s \Bigr\}, \tag{8}$$

where the (normalizing) constant

$$c_H = \sqrt{\frac{2H\,\Gamma\bigl(\frac{3}{2} - H\bigr)}{\Gamma\bigl(\frac{1}{2} + H\bigr)\,\Gamma(2 - 2H)}} \tag{9}$$

is such that $\mathsf{E}B_H^2(1) = 1$.


Remark 1. As regards various representations for the right-hand side of (8),
see [328]. In this paper one can also find a comprehensive discourse on the origins
of the term fractional Brownian motion and its relation to the fractional integral

$$\int_{0}^{t} (t - s)^{H - 1/2}\, dW_s \tag{10}$$

of Holmgren-Riemann-Liouville, where $H$ can be an arbitrary positive number (see
also Weyl [475]).


Remark 2. Almost surely, the trajectories of a fractional Brownian motion $B_H$,
$0 < H < 1$, satisfy the Hölder condition with exponent $\beta < H$. They are incidentally
nowhere differentiable and

$$\varlimsup_{t \to t_0} \left| \frac{B_H(t) - B_H(t_0)}{t - t_0} \right| = \infty \quad \text{(P-a.s.)}$$

for each $t_0 \ge 0$. (The corresponding proof can be carried out in the same way as
for the standard Brownian motion, i.e., for $H = 1/2$; see, e.g., [123].)


Remark 3. Considering a Hölder function $H_t$ ($|H_t - H_s| \le c|t - s|^{\alpha}$, $\alpha > 0$) with
values in $(0, 1)$ in place of $H$ in (8) we obtain a random process that is called a
multifractional Brownian motion. It was introduced and thoroughly investigated
in [381].

Remark 4. In the theory of stochastic processes one assigns a major role to semimartingales,
a class for which there exists a well-developed stochastic calculus (see § 5
and, for more detail, [250] and [304]). It is worth noting in this connection that a
fractional Brownian motion $B_H$, $0 < H \le 1$, is not a semimartingale (except for the
case of $H = 1/2$, i.e., the case of a Brownian motion, and $H = 1$). See [304; Chapter 4,
§ 9, Example 2] for the corresponding proof for the case of $1/2 < H < 1$.


Remark 5. In connection with Hurst’s R/S-analysis (i.e., analysis based on the
study of the properties of the range, empirical standard deviation, and their ratio),
it could be instructive to note following [328] that if $X = (X_t)_{t \ge 0}$ is a continuous
self-similar process with Hurst parameter $H$, $X_0 = 0$, and
$R_t = \sup_{0 \le s \le t} X_s - \inf_{0 \le s \le t} X_s$, then $\mathrm{Law}(R_t) = \mathrm{Law}(t^{H} R_1)$, $t > 0$. In the case of a Brownian motion
($H = 1/2$) W. Feller found a precise formula for the distribution of $R_1$. (Its density
is $8 \sum_{k=1}^{\infty} (-1)^{k-1} k^2 \varphi(kx)$, $x \ge 0$, where $\varphi(x) = (2\pi)^{-1/2} \exp(-x^2/2)$.)


5. There exist various generalizations of the self-similarity property (1). For example,
let $X(\alpha) = (X_t(\alpha))_{t \ge 0}$ be a process of Ornstein-Uhlenbeck type with parameter
$\alpha \in \mathbb{R}$, i.e., a Gaussian Markov process defined by the formula

$$X_t(\alpha) = \int_{0}^{t} e^{\alpha(t - s)}\, dW_s, \qquad t \ge 0, \tag{11}$$

where $W = (W_s)_{s \ge 0}$ is a standard Brownian motion; see § 3a. (Note that $X(\alpha) = (X_t(\alpha))_{t \ge 0}$
is the solution of the stochastic linear differential equation
$dX_t(\alpha) = \alpha X_t(\alpha)\, dt + dW_t$, $X_0(\alpha) = 0$.) From (11) it easily follows that

$$\mathrm{Law}(X_{at}(\alpha),\ t \ge 0) = \mathrm{Law}\bigl(a^{1/2} X_t(\alpha a),\ t \ge 0\bigr)$$

for each $a > 0$, which can be regarded as a self-similarity (of a sort) of the family
of processes $\{X(\alpha),\ \alpha \in \mathbb{R}\}$.
6. We now dwell upon one crucially important method of statistical inference that
is based on self-similarity properties.
Let $X = (X_t)_{t \ge 0}$ be a self-similar process with self-similarity exponent $H$. Assume
that $\Delta > 0$. Then

$$\mathrm{Law}(X_\Delta) = \mathrm{Law}(\Delta^{H} X_1), \tag{12}$$

so that if $f_\Delta(x) = \dfrac{d\mathsf{P}(X_\Delta \le x)}{dx}$ is the density of the probability distribution of $X_\Delta$,
then

$$f_1(x) = \Delta^{H} f_\Delta(x\Delta^{H}). \tag{13}$$

It is common in the usual statistical analysis that, based on considerations of
a general nature, one can assume $X$ to be self-similar with some, unknown in
general, value of the parameter $H$. Assume that we have managed to find a ‘likely’
estimate $\hat H$ for $H$. Then the verification of the conjecture that $X$ is indeed a self-similar
process with parameter $\hat H$ can be carried out as follows.

Assume that, based on independent observations of the values of $X_1$ and $X_\Delta$,
we can construct empirical densities $\hat f_1(x)$ and $\hat f_\Delta(x)$ for sufficiently many values
of $\Delta$. Then if

$$\hat f_1(x) \approx \Delta^{\hat H} \hat f_\Delta(x\Delta^{\hat H}) \tag{14}$$

for a wide range of values of $x$ and $\Delta$, then we have a fairly solid argument in favor
of the conjecture that $X$ is a self-similar process with exponent $\hat H$.

Certainly, if we know the theoretical density $f_1(x)$ (which, of course, depends
on $H$), then instead of verifying (14) we should look if the graph of $\Delta^{\hat H} \hat f_\Delta(x\Delta^{\hat H})$
is close to that of $f_1(x)$ plotted for the Hurst exponent $\hat H$ (or for the true value of
this parameter if it is known a priori).
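
Relation (13) can be checked in closed form in the Gaussian case. The sketch below assumes a zero-mean Gaussian self-similar process with $\mathsf{E}X_t^2 = t^{2H}$ (for instance, a fractional Brownian motion), so that $f_\Delta$ is a normal density with standard deviation $\Delta^{H}$; the function names and the separation of a ‘true’ exponent from a ‘trial’ exponent are illustrative assumptions. With empirical data one would replace `density_at_lag` by a histogram or kernel estimate.

```python
import math


def normal_density(x, std):
    """Density of N(0, std^2) at x."""
    return math.exp(-0.5 * (x / std) ** 2) / (std * math.sqrt(2 * math.pi))


def density_at_lag(x, delta, true_hurst):
    """f_Delta(x) for a Gaussian self-similar process with E X_t^2 = t^(2H)."""
    return normal_density(x, delta ** true_hurst)


def rescaled_density(x, delta, true_hurst, trial_hurst):
    """Right-hand side of (13)/(14) computed with a trial exponent."""
    scale = delta ** trial_hurst
    return scale * density_at_lag(x * scale, delta, true_hurst)
```

When the trial exponent equals the true one, `rescaled_density(x, delta, H, H)` coincides with $f_1(x)$ for every $\Delta$; for a wrong trial exponent the rescaled curves for different $\Delta$ no longer collapse onto one graph, which is precisely what the verification of (14) looks for.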


7. For $\alpha$-stable Lévy processes the Hurst exponent $H$ is equal to $1/\alpha$, therefore the
estimation of $H$ reduces to the estimation of $\alpha$. For a fractional Brownian motion
$B_H = (B_H(t))_{t \ge 0}$ we can estimate $H$ from the results of discrete observations, for
instance, as follows.

We consider the time interval $[0, 1]$ and partition it into $n$ equal parts of length
$\Delta = 1/n$. Let

$$\nu_n(B_H; \Delta) = \frac{1}{n} \sum_{k=1}^{n} \bigl|B_H(k\Delta) - B_H((k - 1)\Delta)\bigr|$$

(cf. formula (19) in Chapter IV, § 3a).
Since

$$\mathsf{E}|B_H(t + s) - B_H(t)| = \sqrt{\frac{2}{\pi}}\, s^{H}, \qquad s \ge 0,$$

it follows that

$$\mathsf{E}\,\nu_n(B_H; \Delta) = \sqrt{\frac{2}{\pi}}\, \Delta^{H}.$$

Hence we infer the natural conclusion: we should take the statistics

$$\hat H_n = \frac{\log\bigl[\sqrt{\pi/2}\; \nu_n(B_H; \Delta)\bigr]}{\log \Delta}$$

as an estimator for $H$. (It was shown in [380] that $\hat H_n \to H$ with probability one.)
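
The estimator $\hat H_n$ is straightforward to compute from a discretely observed path. The following is a minimal sketch; the names `hurst_estimate` and `brownian_path` are illustrative, and the simulated ordinary Brownian motion (that is, $H = 1/2$) serves only as an assumed test case in which the true value of $H$ is known.

```python
import math
import random


def hurst_estimate(path, delta):
    """Estimate H from path = [B(0), B(delta), ..., B(n * delta)]."""
    n = len(path) - 1
    # nu_n: mean absolute increment over the n subintervals of length delta
    nu = sum(abs(path[k] - path[k - 1]) for k in range(1, n + 1)) / n
    return math.log(math.sqrt(math.pi / 2) * nu) / math.log(delta)


def brownian_path(n, rng):
    """Standard Brownian motion sampled at k / n, k = 0, ..., n (case H = 1/2)."""
    delta = 1.0 / n
    path = [0.0]
    for _ in range(n):
        path.append(path[-1] + rng.gauss(0.0, math.sqrt(delta)))
    return path
```

For a path produced by `brownian_path(10000, random.Random(42))` the value `hurst_estimate(path, 1 / 10000)` should be close to 0.5. Note that `delta` must differ from 1, since the estimator divides by $\log \Delta$.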


§2d. Fractional Gaussian Noise: a Process with Strong Aftereffect 


1. In many domains of applied probability theory a Brownian motion $B = (B_t)_{t \ge 0}$
is regarded as an easy way to obtain white noise.
Setting

$$\beta_n = B_n - B_{n-1}, \qquad n \ge 1, \tag{1}$$

we obtain a sequence $\beta = (\beta_n)_{n \ge 1}$ of independent identically distributed random
Gaussian variables with $\mathsf{E}\beta_n = 0$ and $\mathsf{E}\beta_n^2 = 1$. In accordance with Chapter II, § 2a,
we call such a sequence white (Gaussian) noise; we have used it as a source of
randomness in the construction of various random processes, both linear (MA, AR,
ARMA, ...) and nonlinear (ARCH, GARCH, ...).

In a similar way, a fractional Brownian motion $B_H$ is useful for the construction
of both stationary Gaussian sequences with strong aftereffect (systems with
long memory, persistent systems), and sequences with intermittency, antipersistence
(relaxation processes).

By analogy with (1) we now set

$$\beta_n = B_H(n) - B_H(n - 1), \qquad n \ge 1, \tag{2}$$

and we shall call the sequence $\beta = (\beta_n)_{n \ge 1}$ fractional (Gaussian) noise with Hurst
parameter $H$, $0 < H < 1$.

By formula (5) in § 2c for the covariance function of a (standard) process $B_H$ we
obtain that the covariance function $\rho_H(n) = \mathrm{Cov}(\beta_k, \beta_{k+n})$ is as follows:

$$\rho_H(n) = \frac{1}{2}\bigl\{|n + 1|^{2H} - 2|n|^{2H} + |n - 1|^{2H}\bigr\}. \tag{3}$$

Hence

$$\rho_H(n) \sim H(2H - 1)|n|^{2H - 2} \tag{4}$$

as $n \to \infty$.

Thus, if $H = 1/2$, then $\rho_H(n) = 0$ for $n \ne 0$, and $(\beta_n)_{n \ge 1}$ is (as already
mentioned) a Gaussian sequence of independent random variables. On the other
hand, if $H \ne 1/2$, then we see from (4) that the covariance decreases fairly slowly
(as $|n|^{-(2 - 2H)}$) with the increase of $n$, which is usually interpreted as a ‘long memory’
or a ‘strong aftereffect’.
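
Formula (3) and the asymptotic relation (4) are easy to explore numerically. In the sketch below the function names are illustrative and not taken from the text.

```python
def fgn_autocov(n, hurst):
    """Covariance (3) of fractional Gaussian noise at lag n."""
    h2 = 2 * hurst
    return 0.5 * (abs(n + 1) ** h2 - 2 * abs(n) ** h2 + abs(n - 1) ** h2)


def fgn_autocov_asymptotic(n, hurst):
    """Leading term (4): H (2H - 1) |n|^(2H - 2), valid for large |n|."""
    return hurst * (2 * hurst - 1) * abs(n) ** (2 * hurst - 2)


def partial_sum(hurst, n_max):
    """Sum of the covariances over the lags 0, 1, ..., n_max."""
    return sum(fgn_autocov(n, hurst) for n in range(n_max + 1))
```

The sum in `partial_sum` telescopes to $\frac{1}{2}\{(N + 1)^{2H} - N^{2H} + 1\}$ for $N$ = `n_max`, so the partial sums grow without bound for $H > 1/2$ and stay bounded for $H < 1/2$, in agreement with the different behavior of the two cases.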
We should note a crucial difference between the cases of $0 < H < 1/2$ and
$1/2 < H < 1$.

If $0 < H < 1/2$, then the covariance is negative: $\rho_H(n) < 0$ for $n \ne 0$; moreover,
$\sum_{n=0}^{\infty} |\rho_H(n)| < \infty$.

On the other hand, if $1/2 < H < 1$, then the covariance is positive: $\rho_H(n) > 0$
for $n \ne 0$, and $\sum_{n=0}^{\infty} \rho_H(n) = \infty$.

A positive covariance means that positive (negative) values of $\beta_n$ are usually
followed also by positive (respectively, negative) values, so that fractional Gaussian
noise with $1/2 < H < 1$ can serve as a suitable model in the description of the ‘cluster’
phenomena (Chapter IV, § 3e), which one observes in practice, in the empirical
analysis of the returns $h_n = \ln\dfrac{S_n}{S_{n-1}}$ for many financial indexes $S = (S_n)$.

On the other hand, a negative covariance means that positive (negative) values
are usually followed by negative (respectively, positive) ones. Such strong intermittency
(‘up and down and up ...’) is indeed revealed by the analysis of the behavior
of volatilities (see §§ 3 and 4 in Chapter IV).


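2. The sequence $\beta = (\beta_n)$ is a Gaussian stationary sequence with correlation function
$\rho_H(n)$ defined by (3).

By a direct verification one can see that the spectral density $f_H(\lambda)$ of the spectral
representation

$$\rho_H(n) = \int_{-\pi}^{\pi} e^{i\lambda n} f_H(\lambda)\, d\lambda \tag{5}$$

can be expressed as follows:

$$f_H(\lambda) = \frac{\displaystyle\int_{0}^{\infty} (\cos x\lambda)\Bigl(\sin^2 \frac{x}{2}\Bigr) x^{-2H-1}\, dx}{\displaystyle\int_{0}^{\infty} \Bigl(\sin^2 \frac{x}{2}\Bigr) x^{-2H-1}\, dx}. \tag{6}$$

Calculating the corresponding integrals we obtain (see [418] for detail) that

$$f_H(\lambda) = K(H)\,|e^{i\lambda} - 1|^2 \sum_{k=-\infty}^{\infty} \frac{1}{|\lambda + 2\pi k|^{2H+1}}, \qquad |\lambda| \le \pi, \tag{7}$$

where $K(H) = \Bigl(\dfrac{\pi}{H\,\Gamma(2H) \sin(H\pi)}\Bigr)^{-1}$.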


3. The case $H = 1/2$. With an eye to the applications of fractional Brownian
motions to the description of the dynamics of financial indexes, we now choose a
unit of time $n = 0, \pm 1, \pm 2, \ldots$ and set

$$H_n = B_H(n), \qquad h_n = H_n - H_{n-1}.$$

Clearly, $\mathsf{E}h_n = 0$ and $\mathsf{D}h_n = 1$.
The corresponding Gaussian sequence $h = (h_n)$ of independent identically distributed
variables is, as already pointed out in this section, white (Gaussian) noise.


4. The case of $1/2 < H < 1$. The corresponding noise $h = (h_n)$ is often said to
be ‘black’. It is characterized by a strong aftereffect, long memory (a persistent system).

Phenomena of this kind are encountered in the behavior of the levels of rivers,
the character of solar activity, the widths of consecutive annual rings of trees, and,
finally (which is the most interesting from our present standpoint), in the values of
the returns $h_n = \ln\dfrac{S_n}{S_{n-1}}$, $n \ge 1$, for stock prices, currency cross rates, and other
financial indexes (see Chapter IV).

If $H = 1/2$, then the standard deviation $\sqrt{\mathsf{D}(h_1 + \cdots + h_n)}$ increases with $n$
as $\sqrt{n}$, while if $H > 1/2$, then the growth is faster, of order $n^{H}$. To put it otherwise,
the dispersion of the values of the resulting variable $H_n = h_1 + \cdots + h_n$ is larger
than for white noise ($H = 1/2$).

It is instructive to note that if $H = 1$ then we may take $B_H(t) = tB_H(1)$ (P-a.s.)
for a fractional Brownian motion. Hence the sequence $h = (h_n)$ of increments
$h_n = B_H(n) - B_H(n - 1)$, $n \ge 1$, is trivial in this case: we always have $h_n = B_H(1)$,
$n \ge 1$, which one could call ‘perfect persistence’.


5. The case of $0 < H < 1/2$. Typical examples of systems with such values of
the Hurst parameter $H$ are provided by turbulence. The famous Kolmogorov’s Law
of 2/3 ([276], 1941) says that in the case of an incompressible viscous fluid with very
large Reynolds number the mean square of the difference of the velocities at two
points lying at a distance $r$ that is neither too large nor too small is proportional
to $r^{2H}$, where $H = 1/3$.

Fractional noise $h = (h_n)$ with $0 < H < 1/2$ (‘pink noise’) has a negative
covariance, which, as already mentioned, corresponds to fast alternation of the
values of the $h_n$. This is also characteristic of turbulence phenomena, which (coupled
with self-similarity) indicates that a fractional Brownian motion with $0 < H < 1/2$
can serve as a fair model in a description of turbulence.

An example of ‘financial turbulence’, with Hurst parameter $0 < H < 1/2$, is
provided by the sequence $\hat r = (\hat r_n)$ with $\hat r_n = \ln\dfrac{\hat\sigma_n}{\hat\sigma_{n-1}}$, where

$$\hat\sigma_n^2 = \frac{1}{n - 1} \sum_{k=1}^{n} (h_k - \bar h_n)^2$$

is the empirical variance (volatility) of the sequence of logarithmic returns $h = (h_n)$,
$h_n = \ln\dfrac{S_n}{S_{n-1}}$, calculated for stock prices, DJIA, S&P500 Index, etc. (see Chapter IV,
§ 3a).

Many authors (see, e.g., [385], [386], and [180]) see big similarities between
hydrodynamic turbulence and the behavior of prices on financial markets. This
analogy brings, e.g., the authors of [186] to the conclusion that “in any case, we
have reason to believe that the qualitative picture of turbulence that has developed
during the past 70 years will help our understanding of the apparently remote field
of financial markets”.
§ 3a. Brownian Motion and its Role as a Basic Process


1. In a constructive definition of random sequences $h = (h_n)$ describing the dynamics
of the ‘returns’ $h_n = \ln\dfrac{S_n}{S_{n-1}}$ (corresponding, say, to some stock of price $S_n$
at time $n$), one usually assumes, both in linear and nonlinear models, that we
have some basic sequence $\varepsilon = (\varepsilon_n)$, which is the ‘carrier’ of randomness and which
generates $h = (h_n)$. Usually, $\varepsilon = (\varepsilon_n)$ is assumed to be white (Gaussian) noise.
The choice of such a sequence $\varepsilon = (\varepsilon_n)$ as a basis reflects the natural wish
for building ‘complex’ objects (such as, in general, the variables $h_n$) from ‘simple’
bricks.
The sequence $\varepsilon = (\varepsilon_n)$ can indeed be considered ‘simple’, for it consists of independent
identically distributed random variables with classical normal (Gaussian)
distribution $\mathcal{N}(0, 1)$.


2. In the continuous-time case a similar role in the construction of many models of
‘complex’ structure is played by a Brownian motion, introduced as a mathematical
concept for the first time by L. Bachelier ([12], 1900) and A. Einstein ([132], 1905).
A rigorous mathematical theory of a Brownian motion, as well as the corresponding
measure in the function space, were constructed by N. Wiener ([476], 1923), and one
calls them also a Wiener process and the Wiener measure.

By Definition 2 in § 1b, the standard Brownian motion $B = (B_t)_{t \ge 0}$ is a continuous
Gaussian random process with homogeneous independent increments such
that $B_0 = 0$, $\mathsf{E}B_t = 0$, and $\mathsf{E}B_t^2 = t$. Its covariance function is $\mathsf{E}B_sB_t = \min(s, t)$.

We have already pointed out several times the self-similarity property of a
Brownian motion: for each $a > 0$ we have

$$\mathrm{Law}(B_{at};\ t \ge 0) = \mathrm{Law}(a^{1/2} B_t;\ t \ge 0).$$

It follows from this property that the process $(a^{-1/2} B_{at})_{t \ge 0}$ is also a Brownian
motion. In addition, we mention several other transformations bringing about
new processes $B^{(i)}$ ($i = 1, 2, 3, 4$) that are also Brownian motions: $B^{(1)}_t = -B_t$;
$B^{(2)}_t = tB_{1/t}$ for $t > 0$ with $B^{(2)}_0 = 0$; $B^{(3)}_t = B_{t+s} - B_s$ for $s > 0$; $B^{(4)}_t = B_T - B_{T-t}$
for $0 \le t \le T$, $T > 0$.

A multivariate process $B = (B^1, \ldots, B^d)$ formed by $d$ independent standard
Brownian motions $B^i = (B^i_t)_{t \ge 0}$, $i = 1, \ldots, d$, is called a $d$-dimensional standard
Brownian motion.

Endowed with a rich structure, Brownian motion can be useful in the construc- 
tion of various classes of random processes. 

For instance, Brownian motion plays the role of a ‘basic’ process in the construction
of diffusion Markov processes $X = (X_t)_{t \ge 0}$ as solutions of stochastic
differential equations

$$dX_t = a(t, X_t)\, dt + \sigma(t, X_t)\, dB_t, \tag{1}$$

interpreted in the following (integral) sense:

$$X_t = X_0 + \int_{0}^{t} a(s, X_s)\, ds + \int_{0}^{t} \sigma(s, X_s)\, dB_s \tag{2}$$

for each $t > 0$.
The integral

$$I_t = \int_{0}^{t} \sigma(s, X_s)\, dB_s \tag{3}$$

involved in this expression is treated as a stochastic Itô integral with respect to the
Brownian motion. (We consider the issues of stochastic integration and stochastic
differential equations below, in § 3c.)

An important position in financial mathematics is occupied by a geometric Brownian motion $S = (S_t)_{t \ge 0}$ satisfying the stochastic differential equation

$$dS_t = S_t\,(a\,dt + \sigma\,dB_t) \tag{4}$$

with coefficients $a \in \mathbb{R}$ and $\sigma > 0$.
Setting an initial value $S_0$ independent of the Brownian motion $B = (B_t)_{t \ge 0}$ we can find the explicit solution

$$S_t = S_0\,e^{at}\,e^{\sigma B_t - \frac{\sigma^2}{2}t} \tag{5}$$

of this equation, which can be also written as

$$S_t = S_0\,e^{H_t}$$

(cf. formula (1) in Chapter II, § 1a) with

$$H_t = \Bigl(a - \frac{\sigma^2}{2}\Bigr)t + \sigma B_t. \tag{6}$$

We call the process $H = (H_t)_{t \ge 0}$ a Brownian motion with local drift $(a - \sigma^2/2)$ and diffusion $\sigma^2$. We see from (6) that the local drift characterizes the average rate of change of $H = (H_t)_{t \ge 0}$; the diffusion $\sigma^2$ is often called the differential variance or (in the literature on finances) the volatility.
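
Formulas (5) and (6) can be illustrated numerically. The following Python sketch samples $S_t$ from the explicit solution and compares the sample mean with $S_0 e^{at}$, which follows from (5) because $\mathsf{E}e^{\sigma B_t - \sigma^2 t/2} = 1$. The function name `gbm_terminal` and the parameter values are illustrative assumptions, not taken from the text.

```python
import math
import random

def gbm_terminal(s0, a, sigma, t, rng):
    """Sample S_t = S_0 * exp(H_t), with H_t = (a - sigma^2/2) t + sigma B_t."""
    b_t = rng.gauss(0.0, math.sqrt(t))              # B_t ~ N(0, t)
    h_t = (a - 0.5 * sigma ** 2) * t + sigma * b_t  # formula (6)
    return s0 * math.exp(h_t)

rng = random.Random(0)
s0, a, sigma, t = 1.0, 0.1, 0.3, 1.0                # illustrative values only
n = 50_000
samples = [gbm_terminal(s0, a, sigma, t, rng) for _ in range(n)]
sample_mean = sum(samples) / n
exact_mean = s0 * math.exp(a * t)                   # E S_t = S_0 e^{at}
```

Sampling $H_t$ directly avoids any time-discretization error; a step-by-step scheme for (4) would only approximate the same law.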

Probably, it was P. Samuelson ([420], 1965) who understood for the first time the 
importance of a geometric Brownian motion in the description of price dynamics; 
he called it also economic Brownian motion. 

We present now other well-known examples of processes obtainable as solutions of stochastic differential equations (1) with suitably chosen coefficients $a(t, x)$ and $\sigma(t, x)$.

A Brownian bridge $X = (X_t)_{0 \le t \le T}$ with $X_0 = \alpha$ and $X_T = \beta$ is a process governed by the equation

$$dX_t = \frac{\beta - X_t}{T - t}\,dt + dB_t, \qquad 0 < t < T, \tag{7}$$

where $B = (B_t)_{t \ge 0}$ is a Brownian motion.
Using, e.g., Itô's formula (see § 3d below) one can verify that the process $X = (X_t)_{0 \le t \le T}$ with

$$X_t = \alpha\Bigl(1 - \frac{t}{T}\Bigr) + \beta\,\frac{t}{T} + (T - t)\int_0^t \frac{dB_s}{T - s} \tag{8}$$

is a solution of (7) (here we treat the integral as a stochastic integral with respect to a Brownian motion). Since this equation is uniquely soluble (see § 3e), formula (8) defines a Brownian bridge issuing from the point $\alpha$ at time $t = 0$ and arriving at $\beta$ for $t = T$.

For a standard Brownian motion, its autocovariance function $\rho(s, t)$ is equal to $\min(s, t)$, and for a Brownian bridge it is $\rho(s, t) = \min(s, t) - \dfrac{st}{T}$. The corresponding expectation is $\mathsf{E}X_t = \alpha\Bigl(1 - \dfrac{t}{T}\Bigr) + \beta\,\dfrac{t}{T}$.

It is easy to verify that for each standard Brownian motion $W = (W_t)_{t \ge 0}$, the process $W^0 = (W^0_t)_{0 \le t \le T}$ defined by the formula

$$W^0_t = W_t - \frac{t}{T}\,W_T \tag{9}$$

has the covariance function $\rho(s, t) = \min(s, t) - \dfrac{st}{T}$. Hence the process $Y = (Y_t)_{0 \le t \le T}$ with

$$Y_t = \alpha\Bigl(1 - \frac{t}{T}\Bigr) + \beta\,\frac{t}{T} + W^0_t \tag{10}$$

has the same finite-dimensional distributions as $X = (X_t)_{0 \le t \le T}$ defined by (8) ($\operatorname{Law}(Y_t,\ t \le T) = \operatorname{Law}(X_t,\ t \le T)$), and, thus, it can be regarded as a version of a Brownian bridge.
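
The construction (9)-(10) is easy to try out on a time grid. The sketch below is only an illustration: the helper names `brownian_path` and `bridge_path`, the grid size, and the values of $\alpha$, $\beta$, $T$ are assumptions made for the example. It builds bridge paths from simulated Brownian paths and estimates the covariance at two grid times for comparison with $\min(s, t) - st/T$.

```python
import math
import random

def brownian_path(n_steps, T, rng):
    """Return W at the grid times k*T/n_steps, k = 0..n_steps."""
    dt = T / n_steps
    w = [0.0]
    for _ in range(n_steps):
        w.append(w[-1] + rng.gauss(0.0, math.sqrt(dt)))
    return w

def bridge_path(alpha, beta, n_steps, T, rng):
    """Y_t = alpha (1 - t/T) + beta t/T + (W_t - (t/T) W_T), as in (9)-(10)."""
    w = brownian_path(n_steps, T, rng)
    path = []
    for k, w_t in enumerate(w):
        u = k / n_steps                          # u = t / T
        path.append(alpha * (1 - u) + beta * u + (w_t - u * w[-1]))
    return path

rng = random.Random(1)
alpha, beta, T, n_steps = 1.0, 3.0, 2.0, 8       # illustrative values only
paths = [bridge_path(alpha, beta, n_steps, T, rng) for _ in range(20_000)]
i, j = 2, 6
s, t = i * T / n_steps, j * T / n_steps          # s = 0.5, t = 1.5
mean_s = sum(p[i] for p in paths) / len(paths)
mean_t = sum(p[j] for p in paths) / len(paths)
cov_st = sum((p[i] - mean_s) * (p[j] - mean_t) for p in paths) / len(paths)
exact_cov = min(s, t) - s * t / T                # covariance of the bridge
exact_mean_s = alpha * (1 - s / T) + beta * s / T
```

Every simulated path starts at $\alpha$ and ends at $\beta$ by construction, whatever the value of $W_T$.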
An Ornstein–Uhlenbeck process ([466], 1930) is a solution of the (linear) stochastic differential equation

$$dX_t = -\alpha X_t\,dt + \sigma\,dB_t, \qquad \alpha > 0. \tag{11}$$

Again, using Itô's formula we can verify that the process $X = (X_t)_{t \ge 0}$ with

$$X_t = X_0\,e^{-\alpha t} + \sigma\int_0^t e^{-\alpha(t - s)}\,dB_s \tag{12}$$

is the (unique; see § 3e) solution of (11).
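If the initial value $X_0$ is independent of the Brownian motion $B = (B_t)_{t \ge 0}$ and has finite second moment, then

$$\mathsf{E}X_t = e^{-\alpha t}\,\mathsf{E}X_0, \tag{13}$$

$$\mathsf{D}X_t = \frac{\sigma^2}{2\alpha} + \Bigl(\mathsf{D}X_0 - \frac{\sigma^2}{2\alpha}\Bigr)e^{-2\alpha t}, \tag{14}$$

$$\operatorname{Cov}(X_s, X_t) = \Bigl[\mathsf{D}X_0 + \frac{\sigma^2}{2\alpha}\bigl(e^{2\alpha\min(s,t)} - 1\bigr)\Bigr]e^{-\alpha(s+t)}. \tag{15}$$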


If $X_0$ has a normal (Gaussian) distribution with $\mathsf{E}X_0 = 0$ and $\mathsf{D}X_0 = \sigma^2/(2\alpha)$, then $X = (X_t)_{t \ge 0}$ is a stationary Gaussian process with expectation zero and covariance function

$$\rho(s, t) = \frac{\sigma^2}{2\alpha}\,e^{-\alpha|t - s|}. \tag{16}$$
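
Formula (12) gives an exact one-step transition: over a time step $h$, the process is multiplied by $e^{-\alpha h}$ and receives an independent Gaussian term with variance $\sigma^2(1 - e^{-2\alpha h})/(2\alpha)$. The sketch below uses this to check the stationary variance and the covariance (16) by simulation; the function name `ou_step` and the numerical values are assumptions made for the example.

```python
import math
import random

def ou_step(x, alpha, sigma, h, rng):
    """Exact transition of (11) over a time step h, obtained from (12)."""
    decay = math.exp(-alpha * h)
    noise_sd = sigma * math.sqrt((1.0 - decay ** 2) / (2.0 * alpha))
    return x * decay + rng.gauss(0.0, noise_sd)

rng = random.Random(2)
alpha, sigma, h = 1.5, 0.8, 0.4                  # illustrative values only
stat_var = sigma ** 2 / (2.0 * alpha)            # stationary variance
n = 40_000
pairs = []
for _ in range(n):
    x0 = rng.gauss(0.0, math.sqrt(stat_var))     # stationary initial value
    pairs.append((x0, ou_step(x0, alpha, sigma, h, rng)))
var_h = sum(x1 * x1 for _, x1 in pairs) / n      # estimate of D X_h
cov_0h = sum(x0 * x1 for x0, x1 in pairs) / n    # estimate of Cov(X_0, X_h)
exact_cov = stat_var * math.exp(-alpha * h)      # formula (16) with |t - s| = h
```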


It should be noted in connection with the Ornstein–Uhlenbeck equation (11) that it is a well-posed version of Langevin's equation ([295], 1908)

$$m\,\frac{dV_t}{dt} = -\beta V_t + \sigma\,\frac{dB_t}{dt}, \tag{17}$$

describing the evolution of the velocity $V_t$ of a particle of mass $m$ put in a fluid and pushed ahead in the presence of friction $(-\beta V_t)$ by clashes with molecules described by a Brownian motion.

Written as in (17), this equation has in general no sense if one understands derivatives in the usual sense, because (for almost all realizations of a Brownian motion) the derivative $dB_t/dt$ is nonexistent (see § 3b.7 below).

However, we can assign a precise meaning to this equation if we treat it as the Ornstein–Uhlenbeck equation

$$dV_t = -\frac{\beta}{m}\,V_t\,dt + \frac{\sigma}{m}\,dB_t, \tag{18}$$

which has for $V_0 = 0$ a solution that can be described, in accordance with (12), by the equality

$$V_t = \frac{\sigma}{m}\int_0^t e^{-\frac{\beta}{m}(t - s)}\,dB_s. \tag{19}$$

The Bessel process of order $\alpha > 1$ is, by definition, a process $X = (X_t)_{t \ge 0}$ governed by the (nonlinear) stochastic differential equation

$$dX_t = \frac{\alpha - 1}{2}\,\frac{dt}{X_t} + dB_t \tag{20}$$

with initial value $X_0 = x \ge 0$, where $B = (B_t)_{t \ge 0}$ is a Brownian motion. (This equation has a unique strong solution; see § 3e.)
If $\alpha = d$, where $d = 2, 3, \dots$, then $X$ can be realized [402] as the radial component $R = (R_t)_{t \ge 0}$ of a $d$-dimensional Brownian motion

$$B^{(x)} = (x_1 + B^1_t, \dots, x_d + B^d_t)_{t \ge 0}$$

with independent standard Brownian motions $B^i = (B^i_t)_{t \ge 0}$ and $x_1^2 + \dots + x_d^2 = x^2$, i.e.,

$$R_t = \sqrt{(x_1 + B^1_t)^2 + \dots + (x_d + B^d_t)^2}. \tag{21}$$


We discuss several other interesting processes governed by stochastic differen- 
tial equations in § 4, in connection with the construction of models describing the 
dynamics of bond prices P(t, T) (see Chapter I, § 1b). 


§ 3b. Brownian Motion: a Compendium of Classical Results 


1. Brownian motion as a limit of random walks. As described by many
authors (see, for instance, [201; p. 254] or [266; p. 47]), circa 1827, the botanist
R. Brown discovered that pollen particles put in a liquid make chaotic, irregular
movements. (He described this phenomenon in his pamphlet “A Brief Account of
Microscopical Observation ...” published in 1828.)

These movements were called a Brownian motion. As became clear later, they
were brought on by clashes of liquid molecules with the immersed particles. The
corresponding mathematical model of this physical phenomenon was built by
A. Einstein ([132], 1905). However, it should be pointed out for fairness' sake that even
earlier, in 1900, a similar model had been built by L. Bachelier [12] in connection
with the description of the evolution of stock prices and other financial indexes on
the Paris securities market.

As pointed out in Chapter I (§1b), a Brownian motion arose in Bachelier’s 
analysis as a (formal) limit of simplest random walks. 

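Namely, let $(\xi_k)_{k \ge 1}$ be a sequence of independent identically distributed random variables assuming two values, $\pm 1$, with probabilities $\frac12$ (the Bernoulli scheme).

We consider the half-axis $\mathbb{R}_+ = [0, \infty)$, and for each $\Delta > 0$ we construct the process $S^{(\Delta)} = (S^{(\Delta)}_t)_{t \ge 0}$,

$$S^{(\Delta)}_t = \sum_{k=1}^{[t/\Delta]} \xi_k\sqrt{\Delta}, \tag{1}$$

with piecewise constant trajectories.
Starting from the processes $S^{(\Delta)}$ we can also construct random processes $\widetilde S^{(\Delta)} = (\widetilde S^{(\Delta)}_t)_{t \ge 0}$ with continuous trajectories by setting

$$\widetilde S^{(\Delta)}_t = S^{(\Delta)}_{k\Delta} + \frac{1}{\Delta}\,(t - k\Delta)\bigl(S^{(\Delta)}_{(k+1)\Delta} - S^{(\Delta)}_{k\Delta}\bigr) \tag{2}$$

for $k\Delta \le t \le (k+1)\Delta$.

By the multidimensional central limit theorem (see, e.g., [51; Chapter 8] or [439; Chapter VII, § 8]) we can conclude that for all $t_1, \dots, t_k$, $k \ge 1$, the finite-dimensional distributions $\operatorname{Law}(S^{(\Delta)}_{t_1}, \dots, S^{(\Delta)}_{t_k})$ and $\operatorname{Law}(\widetilde S^{(\Delta)}_{t_1}, \dots, \widetilde S^{(\Delta)}_{t_k})$ converge (weakly) as $\Delta \to 0$ to the finite-dimensional distributions $\operatorname{Law}(B_{t_1}, \dots, B_{t_k})$, where $B = (B_t)_{t \ge 0}$ is a standard Brownian motion.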


We can actually say even more; namely,

$$\operatorname{Law}(S^{(\Delta)}_t,\ t \ge 0) \to \operatorname{Law}(B_t,\ t \ge 0)$$

and

$$\operatorname{Law}(\widetilde S^{(\Delta)}_t,\ t \ge 0) \to \operatorname{Law}(B_t,\ t \ge 0)$$

in the sense of weak convergence of distributions in the (Skorokhod) space $D$ (of right-continuous functions having limits from the left) and the space $C$ (of continuous functions); see, e.g., [39] and [250] for greater detail.
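
The one-dimensional case of this convergence can be observed directly. The following sketch samples $S^{(\Delta)}_t$ from formula (1) for a small $\Delta$ and compares its sample variance with $t$ and its empirical distribution function with that of $B_t$ at a single point. The names `scaled_walk` and `normal_cdf` and the chosen values of $t$, $\Delta$, and the sample size are assumptions made for the example.

```python
import math
import random

def scaled_walk(t, delta, rng):
    """S_t^(Delta) = sqrt(Delta) * (xi_1 + ... + xi_[t/Delta]), xi_k = +1 or -1."""
    n_terms = int(t / delta + 1e-9)              # integer part [t / Delta]
    total = sum(rng.choice((-1, 1)) for _ in range(n_terms))
    return math.sqrt(delta) * total

def normal_cdf(x):
    """Distribution function of N(0, 1)."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

rng = random.Random(5)
t, delta, n = 1.0, 0.01, 5_000                   # illustrative values only
values = [scaled_walk(t, delta, rng) for _ in range(n)]
sample_var = sum(v * v for v in values) / n      # should be close to t
frac_below = sum(v <= 0.5 for v in values) / n   # compare with P(B_1 <= 0.5)
```

This checks only a marginal distribution; the functional statement in the spaces $D$ and $C$ concerns whole trajectories and is not tested here.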


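2. Brownian motion as a Markov process. Let $(\Omega, \mathcal{F}, \mathsf{P})$ be some fixed probability space. Let $B = (B_t(\omega))_{t \ge 0}$ be a Brownian motion defined on this space.
By $\mathcal{F}_t^0 = \sigma(B_s,\ s \le t)$ we shall mean the $\sigma$-algebra of events generated by the variables $B_s$, $s \le t$; let

$$\mathcal{F}_t^+ = \bigcap_{s > t}\mathcal{F}_s^0 \tag{3}$$

be the $\sigma$-algebra of events observable not only on the interval $[0, t]$, but also ‘in the infinitesimal future’ relative to the instant $t$.
We note that, by contrast to $(\mathcal{F}_t^0)_{t \ge 0}$, the family $(\mathcal{F}_t^+)_{t \ge 0}$ has an important property of right continuity:

$$\bigcap_{s > t}\mathcal{F}_s^+ = \mathcal{F}_t^+. \tag{4}$$

Still, the difference between the $\sigma$-algebras $\mathcal{F}_t^0$ and $\mathcal{F}_t^+$ is not that essential (in the following sense). Let $\mathcal{N} = \{A \in \mathcal{F}\colon \mathsf{P}(A) = 0\}$ be the set of zero-probability events in $\mathcal{F}$. Then the $\sigma$-algebra $\sigma(\mathcal{F}_t^+ \cup \mathcal{N})$ generated by the events in $\mathcal{F}_t^+$ and $\mathcal{N}$ coincides with the $\sigma$-algebra $\sigma(\mathcal{F}_t^0 \cup \mathcal{N})$ generated by the events in $\mathcal{F}_t^0$ and $\mathcal{N}$:

$$\sigma(\mathcal{F}_t^+ \cup \mathcal{N}) = \sigma(\mathcal{F}_t^0 \cup \mathcal{N}). \tag{5}$$

This is an argument in favor of the introduction of another family of $\sigma$-algebras $(\mathcal{F}_t)_{t \ge 0}$ with $\mathcal{F}_t = \sigma(\mathcal{F}_t^0 \cup \mathcal{N}) = \sigma(\mathcal{F}_t^+ \cup \mathcal{N})$ such that each $\sigma$-algebra is completed with all sets of probability zero, and the whole family is right-continuous, i.e., $\mathcal{F}_t = \bigcap_{s > t}\mathcal{F}_s$. (One says that the corresponding basis $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t \ge 0}, \mathsf{P})$ satisfies the usual conditions; see subsection 3 below.)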

Assume that $T > 0$ and let $B(T) = (B_t(T; \omega))_{t \ge 0}$ be the process constructed from the Brownian motion $B = (B_t(\omega))_{t \ge 0}$ by the formula

$$B_t(T; \omega) = B_{t+T}(\omega) - B_T(\omega).$$

We already mentioned in § 3a.2 that 1) $B(T)$ is also a Brownian motion. Moreover, it is easy to show that 2) the $\sigma$-algebras $\mathcal{F}_T^0 = \sigma(B_s,\ s \le T)$ and $\mathcal{F}_\infty^0(T) = \sigma(B_s(T),\ s \ge 0)$ are independent. (As usual, we prove first that the events in the corresponding cylindrical algebras are independent and then use the ‘monotonic classes method’; see, e.g., [439; Chapter II, § 2].)

It is the combination of these properties that one often calls the Markov property of the Brownian motion (see, e.g., [288; Chapter II]), inferring from it, say, the standard Markov property of the independence of the ‘future’ and the ‘past’ for any fixed ‘present’. Namely, if $f = f(x)$ is a bounded Borel function and $\sigma(B_T)$ is the $\sigma$-algebra generated by $B_T$, then for each $t > 0$ we have

$$\mathsf{E}(f(B_{T+t}) \mid \mathcal{F}_T^0) = \mathsf{E}(f(B_{T+t}) \mid \sigma(B_T)) \quad (\mathsf{P}\text{-a.s.}). \tag{6}$$

This analytic form of the Markov property leaves room for various generalizations. For instance, one can consider the $\sigma$-algebra $\mathcal{F}_T$ in place of $\mathcal{F}_T^0$, and bounded trajectory functionals $f(B_{T+t},\ t \ge 0)$ in place of $f(B_{T+t})$. See, e.g., [123] and [126] for greater detail.

The following generalization, which brings one to the strong Markov property, relates to the extension of the above Markov properties to the case when, in place of (deterministic) time $T$, one considers random Markov times $\tau = \tau(\omega)$.

To this end we assume that $\tau = \tau(\omega)$ is a finite Markov time (relative to the flow $(\mathcal{F}_t)_{t \ge 0}$).

By analogy with $B(T)$ we now consider the process $B(\tau) = (B_t(\tau(\omega); \omega))_{t \ge 0}$, where

$$B_t(\tau(\omega); \omega) = B_{t+\tau(\omega)}(\omega) - B_{\tau(\omega)}(\omega). \tag{7}$$

The most elementary version of the strong Markov property requires that the process $B(\tau)$ be also a Brownian motion and the $\sigma$-algebras $\mathcal{F}_\tau$ (see Definition 2 in Chapter II, § 1b) and $\mathcal{F}_\infty(\tau) = \sigma(\mathcal{F}_\infty^0(\tau) \cup \mathcal{N})$ be independent.

The analytic property (6) has the following, perfectly natural, generalization:

$$\mathsf{E}(f(B_{\tau(\omega)+t}(\omega)) \mid \mathcal{F}_\tau) = \mathsf{E}(f(B_{\tau(\omega)+t}(\omega)) \mid \sigma(B_\tau)) \quad (\mathsf{P}\text{-a.s.}). \tag{8}$$

(As regards generalizations of this property from the case of a function $f$ to functionals, see, e.g., the books [123], [126], and [288].)


3. Brownian motion and square integrable martingales. The following properties follow immediately from the definition of the Brownian motion $B = (B_t)_{t \ge 0}$: for $t \ge 0$

$$B_t \text{ is } \mathcal{F}_t\text{-measurable}, \tag{9}$$

$$\mathsf{E}|B_t| < \infty, \tag{10}$$

$$\mathsf{E}(B_t \mid \mathcal{F}_s) = B_s \quad (\mathsf{P}\text{-a.s.}) \quad \text{for } s \le t. \tag{11}$$

These three properties are precisely the ones from the definition of a martingale $B = (B_t)_{t \ge 0}$ with respect to the flow of $\sigma$-algebras $(\mathcal{F}_t)$ and the probability measure $\mathsf{P}$ (cf. Definition 2 in Chapter II, § 1c).
Further, since

$$\mathsf{E}(B_t^2 - B_s^2 \mid \mathcal{F}_s) = t - s, \tag{12}$$

the process $(B_t^2 - t)_{t \ge 0}$ is also a martingale.

Now let $B = (B_t)_{t \ge 0}$ be some process satisfying (9)-(12). It is a remarkable fact that, in effect, these properties unambiguously specify the probabilistic structure of this process.

Namely, let $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t \ge 0}, \mathsf{P})$ be a filtered probability space with flow of $\sigma$-algebras $(\mathcal{F}_t)_{t \ge 0}$ satisfying the usual conditions ([250; Chapter I, § 1]) of right-continuity and completeness with respect to the measure $\mathsf{P}$. (We point out that the $\mathcal{F}_t$ here are not necessarily the same as the earlier introduced $\sigma$-algebras $\sigma(\mathcal{F}_t^0 \cup \mathcal{N})$.)

Each process $B = (B_t)_{t \ge 0}$ with properties (9)-(11) is called a martingale. To emphasize the property of measurability with respect to the flow $(\mathcal{F}_t)_{t \ge 0}$ and the measure $\mathsf{P}$, one often writes $B = (B_t, \mathcal{F}_t)$ or $B = (B_t, \mathcal{F}_t, \mathsf{P})$. (Cf. definitions in Chapter II, § 1c for the discrete-time case.)


THEOREM (P. Lévy, [298]). Let $B = (B_t, \mathcal{F}_t)_{t \ge 0}$ be a continuous square integrable martingale defined on a filtered probability space $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t \ge 0}, \mathsf{P})$. Assume that (12) holds, i.e., $(B_t^2 - t, \mathcal{F}_t)_{t \ge 0}$ is also a martingale. Then $B = (B_t)_{t \ge 0}$ is a standard Brownian motion.

See the proof in § 5« below. 

4. Wald's identities. Convergence and optional stopping theorems for uniformly integrable martingales. For a Brownian motion we have

$$\mathsf{E}B_t = 0 \quad\text{and}\quad \mathsf{E}B_t^2 = t.$$

In many problems of stochastic calculus it is often necessary to find $\mathsf{E}B_\tau$ and $\mathsf{E}B_\tau^2$ for Markov times $\tau$ (with respect to the flow $(\mathcal{F}_t)_{t \ge 0}$).

The following relations are extended versions of Wald's identities for $B = (B_t, \mathcal{F}_t)_{t \ge 0}$:

$$\mathsf{E}\sqrt{\tau} < \infty \implies \mathsf{E}B_\tau = 0,$$

$$\mathsf{E}\tau < \infty \implies \mathsf{E}B_\tau^2 = \mathsf{E}\tau.$$

In the particular case when $\tau$ is a bounded Markov time ($\mathsf{P}(\tau \le c) = 1$ for some constant $c > 0$), the equalities $\mathsf{E}B_\tau = 0$ and $\mathsf{E}B_\tau^2 = \mathsf{E}\tau$ are immediate consequences of the following result.


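THEOREM (J. Doob, [109]). Let $X = (X_t, \mathcal{F}_t)_{t \ge 0}$ be a uniformly integrable martingale (i.e., a martingale such that $\sup_t \mathsf{E}(|X_t|\,I(|X_t| > N)) \to 0$ as $N \to \infty$). Then

1) there exists an integrable random variable $X_\infty$ such that

$$X_t \to X_\infty \quad (\mathsf{P}\text{-a.s.}), \qquad \mathsf{E}|X_t - X_\infty| \to 0$$

as $t \to \infty$ and

$$\mathsf{E}(X_\infty \mid \mathcal{F}_t) = X_t \quad (\mathsf{P}\text{-a.s.})$$

for all $t \ge 0$;

2) for all Markov times $\sigma$ and $\tau$ we have

$$X_{\tau \wedge \sigma} = \mathsf{E}(X_\sigma \mid \mathcal{F}_\tau) \quad (\mathsf{P}\text{-a.s.}),$$

where $\tau \wedge \sigma = \min(\tau, \sigma)$.

(Cf. Doob's convergence and optional stopping theorems in the discrete-time case in Chapter V, § 3a.)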


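5. Stochastic exponential. In Chapter II, § 1a we gave the definition of a stochastic exponential $\mathcal{E}(H)_t$ for processes $H = (H_t)_{t \ge 0}$ that are semimartingales.

In the case of $H_t = \lambda B_t$ this stochastic exponential $\mathcal{E}(\lambda B)_t$ can be defined by the equality

$$\mathcal{E}(\lambda B)_t = e^{\lambda B_t - \frac{\lambda^2}{2}t}. \tag{13}$$

It immediately follows from Itô's formula (§ 3d) that $X_t = \mathcal{E}(\lambda B)_t$ satisfies the stochastic differential equation

$$dX_t = \lambda X_t\,dB_t \tag{14}$$

with initial condition $X_0 = 1$.
If $\xi$ has the distribution $\mathcal{N}(0, 1)$, then

$$\mathsf{E}e^{\lambda\xi - \frac{\lambda^2}{2}} = 1.$$

By this property and the self-similarity of a Brownian motion, meaning that

$$\operatorname{Law}(\lambda B_t) = \operatorname{Law}(\lambda\sqrt{t}\,B_1),$$

we obtain

$$\mathsf{E}\exp\Bigl\{\lambda B_t - \frac{\lambda^2}{2}t\Bigr\} = \mathsf{E}\exp\Bigl\{\lambda\sqrt{t}\,B_1 - \frac{(\lambda\sqrt{t})^2}{2}\Bigr\} = \mathsf{E}\exp\Bigl\{(\lambda\sqrt{t})\,\xi - \frac{(\lambda\sqrt{t})^2}{2}\Bigr\} = 1.$$

In a similar way we can show that for all $s \le t$

$$\mathsf{E}(\mathcal{E}(\lambda B)_t \mid \mathcal{F}_s) = \mathcal{E}(\lambda B)_s \quad (\mathsf{P}\text{-a.s.}). \tag{15}$$

That is, the stochastic exponential $\mathcal{E}(\lambda B) = (\mathcal{E}(\lambda B)_t)_{t \ge 0}$ is a martingale.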
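
The identity $\mathsf{E}\,\mathcal{E}(\lambda B)_t = 1$ can be checked by numerical quadrature against the standard normal density, using $\operatorname{Law}(\lambda B_t) = \operatorname{Law}(\lambda\sqrt{t}\,\xi)$. The sketch below does this with a plain trapezoidal rule; the function name, the integration window, and the number of nodes are assumptions chosen for the illustration.

```python
import math

def expected_stochastic_exponential(lam, t, half_width=12.0, n=24_000):
    """Trapezoidal value of E exp(lam*sqrt(t)*xi - lam^2 t / 2), xi ~ N(0, 1)."""
    c = lam * math.sqrt(t)
    h = 2.0 * half_width / n
    total = 0.0
    for k in range(n + 1):
        x = -half_width + k * h
        weight = 0.5 if k in (0, n) else 1.0     # trapezoid end-point weights
        density = math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)
        total += weight * math.exp(c * x - 0.5 * c * c) * density
    return total * h

values = [expected_stochastic_exponential(lam, t)
          for lam, t in [(0.5, 1.0), (1.5, 2.0), (-2.0, 0.5)]]
```

The window $[-12, 12]$ is wide enough for moderate values of $\lambda\sqrt{t}$ such as those above; larger values would need a wider window.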


6. Constructions of a Brownian motion. Let $\xi = (\xi_k)_{k \ge 0}$ be Gaussian white noise, i.e., a sequence of independent normally ($\mathcal{N}(0, 1)$) distributed random variables. Let $H_k = H_k(t)$, $k \ge 0$, be the Haar functions (see, e.g., [439; Chapter II, § 11]) defined on the time interval $[0, 1]$, and let

$$S_k(t) = \int_0^t H_k(s)\,ds$$

be the Schauder functions.
We now set

$$B_t^{(n)} = \sum_{k=0}^{n} \xi_k S_k(t).$$

It follows from the results of P. Lévy [298] and Z. Ciesielski [76] that the $B_t^{(n)}$ converge ($\mathsf{P}$-a.s.) uniformly in $t \in [0, 1]$ and their $\mathsf{P}$-a.s. continuous limit is a standard Brownian motion.
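
The partial sums $B_t^{(n)}$ are easy to compute once the Schauder functions are written out. The sketch below assumes the usual indexing of the Haar system, $H_0 \equiv 1$ and $H_k = \pm 2^{j/2}$ on the two halves of $[i/2^j, (i+1)/2^j]$ for $k = 2^j + i$; this indexing and the function names are assumptions, since the text only refers to [439] for the definition. It also evaluates the covariance $\sum_{k \le n} S_k(s)S_k(t)$ of a partial sum, which equals $\min(s, t)$ once $n$ is large enough for $s$ and $t$ to fall in different dyadic cells.

```python
import math
import random

def schauder(k, t):
    """Schauder function S_k(t): the integral of the Haar function H_k over [0, t]."""
    if k == 0:
        return t                                  # H_0 = 1 on [0, 1]
    j = k.bit_length() - 1                        # level, with k = 2**j + i
    i = k - (1 << j)
    left = i / 2 ** j
    mid = (i + 0.5) / 2 ** j
    right = (i + 1) / 2 ** j
    if t <= left or t >= right:
        return 0.0
    scale = 2 ** (j / 2)                          # |H_k| on its support
    return scale * (t - left) if t <= mid else scale * (right - t)

def partial_sum(xi, t):
    """B_t^(n) = sum_{k=0}^{n} xi_k S_k(t), where n = len(xi) - 1."""
    return sum(x * schauder(k, t) for k, x in enumerate(xi))

def partial_covariance(n, s, t):
    """E B_s^(n) B_t^(n) = sum_{k=0}^{n} S_k(s) S_k(t)."""
    return sum(schauder(k, s) * schauder(k, t) for k in range(n + 1))

rng = random.Random(3)
xi = [rng.gauss(0.0, 1.0) for _ in range(2 ** 10)]    # white noise xi_0..xi_1023
path = [partial_sum(xi, m / 64) for m in range(65)]   # approximate path on a grid
```

With $n = 2^J - 1$ the partial sum is the piecewise linear interpolation of the limit process at the points $m/2^J$, which is why the values on the grid `m / 64` no longer change when further terms are added.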

An earlier construction of R. Paley and N. Wiener [374] (1934) is the (uniformly convergent) series

$$B_t = \xi_0 t + \sum_{n=1}^{\infty}\ \sum_{k=2^{n-1}}^{2^n - 1} \sqrt{2}\,\xi_k\,\frac{\sin k\pi t}{k\pi}.$$


7. Local properties of the trajectories. The following results are well known and their proofs can be found in many monographs and textbooks (e.g., [124], [245], [266], [470]).

With probability one, the trajectories of a Brownian motion

a) satisfy the Hölder condition

$$|B_t - B_s| \le c\,|t - s|^{\gamma}$$

for each $\gamma < \frac12$;

b) do not satisfy the Lipschitz condition and therefore are not differentiable at any $t > 0$;

c) have unbounded variation on each interval $(a, b)$: $\int_{(a,b)} |dB_s| = \infty$.


8. The zeros of the trajectories of a Brownian motion. Let $(B_t(\omega))_{t \ge 0}$ be a trajectory of a Brownian motion corresponding to an elementary outcome $\omega \in \Omega$, and let

$$\mathfrak{N}(\omega) = \{0 \le t < \infty\colon B_t(\omega) = 0\}$$

be its zero set.
Then we have the following results ([124], [245], [266], [470]): $\mathsf{P}$-almost surely,

a) Lebesgue measure $\lambda(\mathfrak{N}(\omega))$ is zero;

b) the point $t = 0$ is a condensation point of the zeros;

c) there are no isolated zeros in $(0, \infty)$, so that $\mathfrak{N}(\omega)$ is dense in itself;

d) the set $\mathfrak{N}(\omega)$ is closed and unbounded.


9. Behavior at the origin. The local law of the iterated logarithm states that ($\mathsf{P}$-a.s.)

$$\limsup_{t \downarrow 0} \frac{|B_t|}{\sqrt{2t\ln|\ln t|}} = 1.$$

By this property, as applied to the Brownian motions $(B_{t+h} - B_t)_{h \ge 0}$, we obtain that ($\mathsf{P}$-a.s.)

$$\limsup_{h \downarrow 0} \frac{|B_{t+h} - B_t|}{\sqrt{h}} = \infty$$

for each $t \ge 0$, i.e., the Lipschitz condition, as already pointed out, fails on Brownian trajectories.


10. The modulus of continuity is a geometrically transparent measure for the oscillations of functions, trajectories, and so on. A well-known result of P. Lévy [298] about the modulus of continuity of the trajectories of a Brownian motion states that, with probability one,

$$\limsup_{h \downarrow 0} \frac{\max\limits_{0 \le s < t \le 1,\ t - s \le h} |B_t - B_s|}{\sqrt{2h\ln(1/h)}} = 1.$$


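11. Behavior as $t \to \infty$. With probability one,

$$\frac{B_t}{t} \to 0 \quad\text{as } t \to \infty$$

(Strong law of large numbers).
Moreover,

$$\frac{B_t}{\sqrt{t}\,\ln t} \to 0 \quad\text{as } t \to \infty \quad (\mathsf{P}\text{-a.s.}),$$

but

$$\limsup_{t \to \infty} \frac{|B_t|}{\sqrt{t}} = \infty \quad (\mathsf{P}\text{-a.s.}).$$

The precise asymptotic behavior of the trajectories of a Brownian motion as $t \to \infty$ can be described by the Law of the iterated logarithm. Namely,

$$\limsup_{t \to \infty} \frac{|B_t|}{\sqrt{2t\ln\ln t}} = 1 \quad (\mathsf{P}\text{-a.s.}). \tag{16}$$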


12. Quadratic variation. Although the trajectories of a Brownian motion have ($\mathsf{P}$-a.s.) unbounded variation, i.e., $\int_{(a,b)} |dB_s| = \infty$, we can assert that $\int_{(a,b)} |dB_s|^2 = b - a$ in a certain sense.
The corresponding result, which plays a key role in many issues of stochastic calculus (e.g., in the proof of Itô's formula; § 3d), can be stated as follows.

Let $T^{(n)} = (t_0^{(n)}, \dots, t_{k_n}^{(n)})$ be a partitioning of the interval $[a, b]$ such that

$$a = t_0^{(n)} \le \dots \le t_{k_n}^{(n)} = b.$$

Let

$$\|T^{(n)}\| = \sup_{0 \le k < k_n} \bigl|t_{k+1}^{(n)} - t_k^{(n)}\bigr|. \tag{17}$$

Then we have:

a) if $\|T^{(n)}\| \to 0$ as $n \to \infty$, then

$$\sum_{k=0}^{k_n - 1} \bigl|B_{t_{k+1}^{(n)}} - B_{t_k^{(n)}}\bigr|^2 \xrightarrow{\ \mathsf{P}\ } b - a; \tag{18}$$

b) if $\sum_{n=1}^{\infty} \|T^{(n)}\| < \infty$, then we have convergence in (18) with probability one;

c) if $B^{(1)}$ and $B^{(2)}$ are two independent Brownian motions and $\|T^{(n)}\| \to 0$ as $n \to \infty$, then

$$\sum_{k=0}^{k_n - 1} \bigl(B^{(1)}_{t_{k+1}^{(n)}} - B^{(1)}_{t_k^{(n)}}\bigr)\bigl(B^{(2)}_{t_{k+1}^{(n)}} - B^{(2)}_{t_k^{(n)}}\bigr) \xrightarrow{\ \mathsf{P}\ } 0. \tag{19}$$

In a symbolic form, relations (18) and (19) are often written as follows:

$$(dB_t)^2 = dt \quad\text{and}\quad dB_t^{(1)}\,dB_t^{(2)} = 0. \tag{20}$$
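
Relations (18) and (19) can be observed on simulated increments. The sketch below takes a uniform partition of $[a, b]$ (an assumption made for simplicity; the statements above allow arbitrary partitions) and computes the sum of squared increments, the sum of products of increments of two independent Brownian motions, and, for contrast, the sum of absolute increments. Names and numerical values are illustrative.

```python
import math
import random

def increments(n, a, b, rng):
    """Independent Brownian increments over a uniform partition of [a, b]."""
    dt = (b - a) / n
    return [rng.gauss(0.0, math.sqrt(dt)) for _ in range(n)]

rng = random.Random(4)
a, b, n = 0.0, 1.0, 20_000                       # illustrative values only
d1 = increments(n, a, b, rng)                    # increments of B^(1)
d2 = increments(n, a, b, rng)                    # increments of an independent B^(2)
quadratic_variation = sum(x * x for x in d1)     # close to b - a, as in (18)
cross_variation = sum(x * y for x, y in zip(d1, d2))   # close to 0, as in (19)
total_variation = sum(abs(x) for x in d1)        # grows like sqrt(n)
```

Refining the partition makes the first two sums settle down while the third keeps growing, in line with the unbounded variation of the trajectories.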




13. Passage times of levels. a) Assume that $a > 0$ and let $T_a = \inf\{t \ge 0\colon B_t = a\}$. Then it is clear that

$$\mathsf{P}(T_a < t) = \mathsf{P}\Bigl(\sup_{s \le t} B_s > a\Bigr). \tag{21}$$

Using the reflection principle of D. André we see that

$$\mathsf{P}(T_a < t) = 2\,\mathsf{P}(B_t \ge a). \tag{22}$$

(See, e.g., [124], [266], and [439].)
Since

$$\mathsf{P}(B_t \ge a) = \frac{1}{\sqrt{2\pi t}}\int_a^{\infty} e^{-\frac{x^2}{2t}}\,dx, \tag{23}$$

it follows that

$$\mathsf{P}(T_a < t) = \int_0^t \frac{a}{\sqrt{2\pi s^3}}\,e^{-\frac{a^2}{2s}}\,ds, \tag{24}$$

therefore the distribution density $p_a(t) = \dfrac{\partial\,\mathsf{P}(T_a < t)}{\partial t}$ is defined by the formula

$$p_a(t) = \frac{a}{\sqrt{2\pi t^3}}\,e^{-\frac{a^2}{2t}}. \tag{25}$$

Hence, in particular, $\mathsf{P}(T_a < \infty) = 1$, $\mathsf{E}T_a = \infty$, and the corresponding Laplace transform is

$$\mathsf{E}e^{-\lambda T_a} = e^{-a\sqrt{2\lambda}}. \tag{26}$$

It should be noted that $T = (T_a)_{a \ge 0}$ is a process with independent homogeneous increments (by the strong Markov property of a Brownian motion). Moreover, this is a stable process with parameter $\alpha = \frac12$, i.e., $\operatorname{Law}(T_a) = \operatorname{Law}(a^2 T_1)$ (cf. § 1a.5).
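
Formulas (22)-(26) can be cross-checked numerically. The sketch below integrates the density (25) with a midpoint rule and compares the result with $2\,\mathsf{P}(B_t \ge a)$, written through the complementary error function, and with the Laplace transform (26). The helper names, the truncation of the infinite integral at $40$, and the parameter values are assumptions made for the example.

```python
import math

def passage_density(a, t):
    """Density (25) of T_a = inf{t >= 0: B_t = a}."""
    return a / math.sqrt(2.0 * math.pi * t ** 3) * math.exp(-a * a / (2.0 * t))

def midpoint_integral(g, lo, hi, n):
    """Composite midpoint rule for the integral of g over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(g(lo + (k + 0.5) * h) for k in range(n))

a, t, lam = 1.0, 2.0, 1.0                        # illustrative values only
cdf_from_density = midpoint_integral(
    lambda s: passage_density(a, s), 0.0, t, 20_000)            # formula (24)
cdf_from_reflection = math.erfc(a / math.sqrt(2.0 * t))         # 2 P(B_t >= a)
laplace_numeric = midpoint_integral(
    lambda s: math.exp(-lam * s) * passage_density(a, s), 0.0, 40.0, 80_000)
laplace_exact = math.exp(-a * math.sqrt(2.0 * lam))             # formula (26)
```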

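b) Assume that $a > 0$ and let $S_a = \inf\{t \ge 0\colon |B_t| = a\}$. We claim that $\mathsf{E}S_a = a^2$ and

$$\mathsf{E}e^{-\lambda S_a} = \frac{1}{\cosh(a\sqrt{2\lambda})}. \tag{27}$$

By Wald's identity, $\mathsf{E}B^2_{S_a \wedge t} = \mathsf{E}(S_a \wedge t)$ for each $t > 0$, therefore $\mathsf{E}(S_a \wedge t) \le a^2$.

Hence $\mathsf{E}S_a = \lim_{t \to \infty} \mathsf{E}(S_a \wedge t) \le a^2$ by the monotone convergence theorem. (Of course, it follows from this that $S_a < \infty$ with probability one.) On the other hand, $\mathsf{E}S_a < \infty$, therefore, using Wald's identity again we obtain that $\mathsf{E}B^2_{S_a} = \mathsf{E}S_a$. Bearing in mind that $|B_{S_a}| = a$ we see that $\mathsf{E}S_a = a^2$.

To prove (27) we now consider the martingale $X^{(a)} = (X_{t \wedge S_a}, \mathcal{F}_t)$, where

$$X_{t \wedge S_a} = \exp\Bigl\{\lambda B_{t \wedge S_a} - \frac{\lambda^2}{2}\,(t \wedge S_a)\Bigr\}. \tag{28}$$

Since $|B_{t \wedge S_a}| \le a$, this is a uniformly integrable martingale and, by Doob's theorem in subsection 4,

$$\mathsf{E}X_{t \wedge S_a} = 1. \tag{29}$$

By the theorem on dominated convergence, we can pass to the limit as $t \to \infty$ in this equality to obtain

$$\mathsf{E}X_{S_a} = 1,$$

i.e.,

$$\mathsf{E}\exp\Bigl\{\lambda B_{S_a} - \frac{\lambda^2}{2}\,S_a\Bigr\} = 1. \tag{30}$$

Since $\mathsf{P}(S_a < \infty) = 1$ and since $\mathsf{P}(B_{S_a} = a) = \mathsf{P}(B_{S_a} = -a) = \frac12$ by symmetry reasons, the required equality (27) follows by (30). Indeed, by the same symmetry $B_{S_a}$ and $S_a$ are independent, so that (30) reads $\cosh(\lambda a)\,\mathsf{E}e^{-\frac{\lambda^2}{2}S_a} = 1$, which is (27) with $\lambda^2/2$ in place of $\lambda$.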
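c) Let $T_{a,b} = \inf\{t\colon B_t = a + bt\}$, $a > 0$. If $b < 0$, then $\mathsf{P}(T_{a,b} < \infty) = 1$. Using the fact that the process $\bigl(e^{\theta B_t - \frac{\theta^2}{2}t}\bigr)_{t \ge 0}$ is a martingale, and setting $\theta = b + \sqrt{b^2 + 2\lambda}$ we can find the Laplace transform

$$\mathsf{E}e^{-\lambda T_{a,b}} = \exp\bigl\{-a\bigl[b + \sqrt{b^2 + 2\lambda}\bigr]\bigr\}. \tag{31}$$

By this formula or directly from Wald's identity $0 = \mathsf{E}B_{T_{a,b}}$ $(= a + b\,\mathsf{E}T_{a,b})$ we obtain

$$\mathsf{E}T_{a,b} = -\frac{a}{b}.$$

(Using the trick described after formula (27) we can prove that $\mathsf{E}T_{a,b} < \infty$.)

If $b > 0$, then we consider the martingale $\bigl(e^{\theta B_t - \frac{\theta^2}{2}t}\bigr)_{t \ge 0}$ with $\theta = 2b$ to obtain

$$\exp\Bigl\{2bB_{t \wedge T_{a,b}} - \frac{(2b)^2}{2}\,(t \wedge T_{a,b})\Bigr\} \le \exp\Bigl\{2b\bigl(a + b\,(t \wedge T_{a,b})\bigr) - \frac{(2b)^2}{2}\,(t \wedge T_{a,b})\Bigr\} = e^{2ab},$$

so that this martingale is uniformly integrable. Hence

$$1 = \mathsf{E}\exp\Bigl\{\theta B_{T_{a,b}} - \frac{\theta^2}{2}\,T_{a,b}\Bigr\} = \mathsf{E}\exp\Bigl\{\theta B_{T_{a,b}} - \frac{\theta^2}{2}\,T_{a,b}\Bigr\}I(T_{a,b} < \infty) = \mathsf{P}(T_{a,b} < \infty)\,e^{2ab},$$

therefore if $a > 0$ and $b > 0$ (or $a < 0$ and $b < 0$), then

$$\mathsf{P}(T_{a,b} < \infty) = e^{-2ab}. \tag{32}$$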
14. Maximal inequalities. Let $B = (B_t)_{t \ge 0}$ be a Brownian motion. Then for $\lambda > 0$, $p \ge 1$, and a finite Markov time $T$ we have

$$\mathsf{P}\Bigl(\max_{t \le T} |B_t| \ge \lambda\Bigr) \le \frac{\mathsf{E}|B_T|^p}{\lambda^p} \tag{33}$$

and

$$\mathsf{E}|B_T|^p \le \mathsf{E}\max_{t \le T} |B_t|^p \le \Bigl(\frac{p}{p - 1}\Bigr)^p\,\mathsf{E}|B_T|^p \tag{34}$$

for $p > 1$.

In particular, if $p = 2$, then

$$\mathsf{P}\Bigl(\max_{t \le T} |B_t| \ge \lambda\Bigr) \le \frac{\mathsf{E}B_T^2}{\lambda^2}, \tag{35}$$

$$\mathsf{E}\max_{t \le T} B_t^2 \le 4\,\mathsf{E}B_T^2. \tag{36}$$

Inequalities (33) and (35) are called the Kolmogorov–Doob inequalities, while (34) and (36) are called Doob's inequalities (see, e.g., [109], [110], [124], [303], [304], or [402]).
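By (36) we obtain

$$\mathsf{E}\max_{t \le T} |B_t| \le 2\sqrt{\mathsf{E}B_T^2}. \tag{37}$$

If $\mathsf{E}T < \infty$, then $\mathsf{E}B_T^2 = \mathsf{E}T$, so that

$$\mathsf{E}\max_{t \le T} |B_t| \le 2\sqrt{\mathsf{E}T}. \tag{38}$$

As shown in [116], one can refine inequality (38); actually,

$$\mathsf{E}\max_{t \le T} |B_t| \le \sqrt{2}\,\sqrt{\mathsf{E}T}, \tag{39}$$

where the constant $\sqrt{2}$ is best possible.
Setting $T = 1$ in (39) we obtain

$$\mathsf{E}\max_{t \le 1} |B_t| \le \sqrt{2}. \tag{40}$$

Of course, it would be interesting to find the precise value of $\mathsf{E}\max_{t \le 1} |B_t|$.

The following argument shows that

$$\mathsf{E}\max_{t \le 1} |B_t| = \sqrt{\frac{\pi}{2}}. \tag{41}$$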



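By the self-similarity of a Brownian motion, with $S_1 = \inf\{t \ge 0\colon |B_t| = 1\}$, we find

$$\mathsf{P}\Bigl\{\sup_{t \le 1} |B_t| \le x\Bigr\} = \mathsf{P}\Bigl\{\sup_{t \le 1} \frac{1}{x}\,|B_t| \le 1\Bigr\} = \mathsf{P}\Bigl\{\sup_{t \le 1} |B_{t/x^2}| \le 1\Bigr\}$$

$$= \mathsf{P}\Bigl\{\sup_{t \le 1/x^2} |B_t| \le 1\Bigr\} = \mathsf{P}\Bigl\{S_1 \ge \frac{1}{x^2}\Bigr\} = \mathsf{P}\Bigl\{\frac{1}{\sqrt{S_1}} \le x\Bigr\}.$$

Hence $\operatorname{Law}\bigl(\sup_{t \le 1} |B_t|\bigr) = \operatorname{Law}\bigl(1/\sqrt{S_1}\bigr)$.

Further, since

$$\sigma = \sqrt{\frac{2}{\pi}}\int_0^{\infty} e^{-\frac{x^2}{2\sigma^2}}\,dx$$

for each $\sigma > 0$ by the properties of the normal integral, setting $\sigma = 1/\sqrt{S_1}$ we obtain from (27) that

$$\mathsf{E}\sup_{t \le 1} |B_t| = \mathsf{E}\frac{1}{\sqrt{S_1}} = \sqrt{\frac{2}{\pi}}\int_0^{\infty} \mathsf{E}e^{-\frac{x^2}{2}S_1}\,dx = \sqrt{\frac{2}{\pi}}\int_0^{\infty} \frac{dx}{\cosh x}$$

$$= 2\sqrt{\frac{2}{\pi}}\int_0^{\infty} \frac{e^x\,dx}{e^{2x} + 1} = 2\sqrt{\frac{2}{\pi}}\int_1^{\infty} \frac{dy}{1 + y^2} = 2\sqrt{\frac{2}{\pi}}\cdot\frac{\pi}{4} = \sqrt{\frac{\pi}{2}},$$

which proves the required relation (41).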
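
The last step of this computation rests on the value $\int_0^\infty dx/\cosh x = \pi/2$, which is easy to confirm numerically. The sketch below does so with a midpoint rule; the function name, the truncation point $40$, and the number of nodes are assumptions chosen for the illustration.

```python
import math

def integral_inv_cosh(upper=40.0, n=40_000):
    """Midpoint-rule value of the integral of dx / cosh(x) over [0, upper]."""
    h = upper / n
    return h * sum(1.0 / math.cosh((k + 0.5) * h) for k in range(n))

expected_max = math.sqrt(2.0 / math.pi) * integral_inv_cosh()
closed_form = math.sqrt(math.pi / 2.0)           # right-hand side of (41)
upper_bound = math.sqrt(2.0)                     # right-hand side of (40)
```

The computed value also lies below the bound $\sqrt{2}$ from (40), as it should.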




§ 3c. Stochastic Integration with respect to a Brownian Motion


1. Classical analysis has at its disposal various approaches to the ‘operation of
integration’, which bring about such (generally speaking, different) concepts as
Riemann, Lebesgue, Riemann-Stieltjes, Lebesgue-Stieltjes, Denjoy, and other
integrals (see A. N. Kolmogorov’s paper “A study of the concept of integral” in [277]).

In stochastic analysis one also considers various approaches to integration of
random functions with respect to stochastic processes, stochastic measures, and so
on, which bring about various constructions of ‘stochastic integrals’.

Apparently, N. Wiener was the first to define the stochastic integral

$$I_t(f) = \int_{(0,t]} f(s)\,dB_s \tag{1}$$

for smooth deterministic functions $f = f(s)$, $s \ge 0$, using the idea of ‘integration by parts’ (see [375] and [476]).
Namely, starting from the ‘natural’ formula $d(fB) = f\,dB + B\,df$, one sets by definition

$$I_t(f) = f(t)B_t - \int_0^t f'(s)B_s\,ds, \tag{2}$$

where the integral $\int_0^t f'(s)B_s\,ds$ is treated as the trajectory-wise (i.e., for each $\omega \in \Omega$) Riemann integral of the continuous functions $f'(s)B_s(\omega)$, $s \ge 0$.


2. In 1944, K. Itô [244] made a significant step forward in the extension of the concept of a ‘stochastic integral’ and laid in this way the foundations of modern stochastic calculus, a powerful and efficient tool for the investigation of stochastic processes.

Itô's construction is as follows.

Let $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t \ge 0}, \mathsf{P})$ be a filtered probability space satisfying the usual conditions (see § 3b.2 and details in [250]). Let $B = (B_t, \mathcal{F}_t)_{t \ge 0}$ be a standard Brownian motion and let $f = (f(t, \omega))_{t \ge 0,\ \omega \in \Omega}$ be a random function that is measurable with respect to $(t, \omega)$ and nonanticipating (independent of the ‘future’), i.e.,

$$f(t, \omega) \text{ is } \mathcal{F}_t\text{-measurable}$$

for each $t \ge 0$.

Such functions $f = f(t, \omega)$ are also said to be adapted (to the family of $\sigma$-algebras $(\mathcal{F}_t)_{t \ge 0}$).

Examples are provided by elementary functions

$$f(t, \omega) = Y(\omega)\,I_{\{0\}}(t), \tag{3}$$

where $Y(\omega)$ is a $\mathcal{F}_0$-measurable random variable.
Another example is the function

$$f(t, \omega) = Y(\omega)\,I_{(r,s]}(t) \tag{4}$$

(also said to be elementary), where $0 \le r < s$ and $Y(\omega)$ is a $\mathcal{F}_r$-measurable random variable.

For functions of type (3) ‘concentrated’ at $t = 0$ (with respect to the time variable) the ‘natural’ value of the stochastic integral

$$I_t(f) = \int_{(0,t]} f(s, \omega)\,dB_s$$

is zero. If $f$ is a function of type (4), then the ‘natural’ value of $I_t(f)$ is $Y(\omega)[B_{s \wedge t} - B_{r \wedge t}]$.
For a simple function

$$f(t, \omega) = Y_0(\omega)\,I_{\{0\}}(t) + \sum_i Y_i(\omega)\,I_{(r_i, s_i]}(t) \tag{5}$$

that is a linear combination of elementary functions we set by definition

$$I_t(f) = \int_{(0,t]} f(s, \omega)\,dB_s = \sum_i Y_i(\omega)\bigl[B_{s_i \wedge t} - B_{r_i \wedge t}\bigr]. \tag{6}$$

Remark 1. We point out that one need not assume that $B = (B_t)_{t \ge 0}$ is a Brownian motion in the definition of such integrals of elementary functions. Any process can play this role. The peculiarities of a Brownian motion, which is the object of our current interests, become essential if one wants to define a ‘stochastic integral’ with simple properties for a broader store of functions $f = f(t, \omega)$, not merely for elementary ones and their linear combinations, simple functions.


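Let us now agree to treat integrals $\int_0^t$ (usual or stochastic) as integrals $\int_{(0,t]}$ over the set $(0, t]$. Since the $Y_i(\omega)$ are $\mathcal{F}_{r_i}$-measurable, it follows that

$$\mathsf{E}\bigl[Y_i(\omega)(B_{s_i \wedge t} - B_{r_i \wedge t})\bigr] = \mathsf{E}\,\mathsf{E}\bigl[Y_i(\omega)(B_{s_i \wedge t} - B_{r_i \wedge t}) \mid \mathcal{F}_{r_i}\bigr] = \mathsf{E}\bigl[Y_i(\omega)\,\mathsf{E}\bigl((B_{s_i \wedge t} - B_{r_i \wedge t}) \mid \mathcal{F}_{r_i}\bigr)\bigr] = 0.$$

In a similar way,

$$\mathsf{E}\bigl[Y_i(\omega)(B_{s_i \wedge t} - B_{r_i \wedge t})\bigr]^2 = \mathsf{E}Y_i^2\,(s_i \wedge t - r_i \wedge t).$$

Hence if $f = f(t, \omega)$ is a simple function, then

$$\mathsf{E}\int_0^t f(s, \omega)\,dB_s = 0 \tag{7}$$

and

$$\mathsf{E}\Bigl(\int_0^t f(s, \omega)\,dB_s\Bigr)^2 = \mathsf{E}\int_0^t f^2(s, \omega)\,ds. \tag{8}$$

In a more compact form,

$$\mathsf{E}I_t(f) = 0, \tag{7'}$$

$$\mathsf{E}I_t^2(f) = \mathsf{E}\int_0^t f^2(s, \omega)\,ds. \tag{8'}$$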
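
Properties (7') and (8') can be checked by simulation for a single elementary function of type (4). The sketch below takes $f = Y I_{(r,s]}$ with the $\mathcal{F}_r$-measurable choice $Y = \operatorname{sign}(B_r)$; this choice of $Y$, the function name, and the values of $r$, $s$, $t$ are assumptions made for the example. With $Y^2 = 1$ the right-hand side of (8') equals $s \wedge t - r \wedge t$.

```python
import math
import random

def elementary_integral(r, s, t, rng):
    """I_t(f) = Y [B_{s^t} - B_{r^t}] for f = Y 1_(r, s], with Y = sign(B_r)."""
    b_r = rng.gauss(0.0, math.sqrt(r))           # B_r (requires r > 0)
    y = 1.0 if b_r >= 0 else -1.0                # Y depends on B_r only
    length = max(min(s, t) - min(r, t), 0.0)     # (s ^ t) - (r ^ t)
    # The increment over (r ^ t, s ^ t] is independent of B_r.
    increment = rng.gauss(0.0, math.sqrt(length)) if length > 0 else 0.0
    return y * increment

rng = random.Random(7)
r, s, t = 0.5, 1.5, 1.0                          # illustrative values only
n = 20_000
samples = [elementary_integral(r, s, t, rng) for _ in range(n)]
mean_value = sum(samples) / n                    # compare with (7')
second_moment = sum(x * x for x in samples) / n  # compare with (8')
isometry_rhs = min(s, t) - min(r, t)             # E int f^2 ds = 0.5
```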


3. We shall now describe the classes of functions $f = f(t, \omega)$ to which one can extend ‘stochastic integration’ preserving ‘natural’ properties (e.g., (7) and (8)).
We shall assume that all the functions $f = f(t, \omega)$ under consideration are defined on $\mathbb{R}_+ \times \Omega$ and are nonanticipating.
If

$$\mathsf{P}\Bigl(\int_0^t f^2(s, \omega)\,ds < \infty\Bigr) = 1 \tag{9}$$

for each $t > 0$, then we say that $f$ belongs to the class $J_1$.
If, in addition,

$$\mathsf{E}\int_0^t f^2(s, \omega)\,ds < \infty \tag{10}$$

for all $t > 0$, then $f$ is said to belong to $J_2$.
It is for these function classes that Itô [244] gave his ‘natural’ definition of a stochastic integral $I_t(f)$ based on the following observations.
Since the trajectories of a Brownian motion ($\mathsf{P}$-a.s.) are of unbounded variation (see § 3b.7), the integral $I_t(f) = \int_0^t f(s, \omega)\,dB_s$ cannot be defined as the trajectory-wise Lebesgue-Stieltjes integral. Itô's idea was to define it as the limit (in a suitable probabilistic sense) of integrals $I_t(f_n)$ of simple functions $f_n$, $n \ge 1$, approximating the integrand $f$.

It turned out (see [244] or, for greater detail, [303; Chapter 4]) that if $f \in J_2$, then there exists a sequence of simple functions $f_n = f_n(t, \omega)$ such that

$$\mathsf{E}\int_0^t [f(s, \omega) - f_n(s, \omega)]^2\,ds \to 0 \tag{11}$$

for each $t > 0$. Hence if $f \in J_2$, then

$$\mathsf{E}\int_0^t [f_n(s, \omega) - f_m(s, \omega)]^2\,ds \to 0 \tag{12}$$

as $m, n \to \infty$.
By (8) we obtain the isometry relation

$$\mathsf{E}[I_t(f_n) - I_t(f_m)]^2 = \mathsf{E}\int_0^t [f_n(s, \omega) - f_m(s, \omega)]^2\,ds. \tag{13}$$

Taken together with (12), it shows that the random variables $\{I_t(f_n)\}_{n \ge 1}$ form a Cauchy sequence in the sense of convergence in mean square (i.e., in $L^2$).

Hence, by the Cauchy criterion for $L^2$-convergence (see, e.g., [439; Chapter II, § 10]) there exists a random variable in $L^2$, which we denote by $I_t(f)$, such that

$$I_t(f) = \operatorname{l.i.m.} I_t(f_n),$$

i.e.,

$$\mathsf{E}[I_t(f) - I_t(f_n)]^2 \to 0 \quad\text{as } n \to \infty.$$

This limit, $I_t(f)$, which is easily seen to be independent of the choice of an approximating sequence $(f_n)_{n \ge 1}$, is also denoted by $\int_0^t f(s, \omega)\,dB_s$ and is called the stochastic integral of the (nonanticipating) function $f = f(s, \omega)$ with respect to the Brownian motion $B = (B_s)_{s \ge 0}$ over the time interval $(0, t]$.
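
As an illustration of this limiting procedure, take $f(s, \omega) = B_s(\omega)$ and approximate it by the simple functions equal to $B_{t_k}$ on $(t_k, t_{k+1}]$ for a uniform grid $t_k$. The sketch below computes the corresponding integral (6) on one simulated path. The sum satisfies an exact algebraic identity involving the squared increments, and by the quadratic variation property of § 3b.12 it is therefore close to $(B_t^2 - t)/2$, the value that Itô's formula gives for $\int_0^t B_s\,dB_s$; this standard fact is quoted here and is not derived in the text above. Grid size and seed are arbitrary choices.

```python
import math
import random

rng = random.Random(6)
t, n = 1.0, 4_000                                # illustrative values only
dt = t / n
b = [0.0]
for _ in range(n):
    b.append(b[-1] + rng.gauss(0.0, math.sqrt(dt)))

# Integral (6) of the simple function equal to B_{t_k} on (t_k, t_{k+1}].
ito_sum = sum(b[k] * (b[k + 1] - b[k]) for k in range(n))
squares = sum((b[k + 1] - b[k]) ** 2 for k in range(n))
identity_rhs = 0.5 * (b[-1] ** 2 - squares)      # exact algebraic identity
ito_limit = 0.5 * (b[-1] ** 2 - t)               # (B_t^2 - t) / 2
```

Evaluating the integrand at the left end of each interval is what keeps the approximating functions nonanticipating.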
We now discuss the properties of the integrals $I_t(f)$ ($t > 0$) so defined for functions $f \in J_2$; the details can be found, e.g., in [123], [250], [288], or [303].

(a) If $f, g \in J_2$ and $a$ and $b$ are constants, then

$$I_t(af + bg) = aI_t(f) + bI_t(g).$$

(b) One can choose modifications of the variables $I_t(f)$, $t \ge 0$, coordinated so that $I(f) = (I_t(f))_{t \ge 0}$ with $I_0(f) = 0$ is a continuous stochastic process (this is precisely the modification we shall consider throughout). Moreover,

$$I_s(f) = I_t(f\,I_{(0,s]}), \qquad s \le t. \tag{14}$$

(c) If $\tau = \tau(\omega)$ is a Markov time such that $\tau(\omega) \le T$, then

$$I_\tau(f) = I_T(f\,I_{(0,\tau]}), \tag{15}$$

where, by definition, $I_\tau(f) = I_{\tau(\omega)}(f)$.

(d) The process $I(f) = (I_t(f))_{t \ge 0}$ is a square integrable martingale, i.e.,

$I_t(f)$ are $\mathcal{F}_t$-measurable, $t > 0$;

$\mathsf{E}I_t^2(f) < \infty$, $t > 0$;

$\mathsf{E}(I_t(f) \mid \mathcal{F}_s) = I_s(f)$, $s \le t$.

In addition, if $f, g \in J_2$, then

$$\mathsf{E}I_t(f)I_t(g) = \mathsf{E}\int_0^t f(s, \omega)\,g(s, \omega)\,ds. \tag{16}$$

Remark 2. By analogy with the notation used for discrete time (see Definition 7 in Chapter II, § 1c), one often denotes $I_t(f)$ also by $(f \cdot B)_t$ (cf. [250; Chapter I, § 4d]).

We now proceed to the definition of stochastic integrals $I_t(f)$ for functions in the class $J_1$; we refer to [303; Chapter 4, § 4] for more detail.

Let $f \in J_1$, i.e., let $\mathsf{P}\bigl(\int_0^t f^2(s, \omega)\,ds < \infty\bigr) = 1$, $t > 0$. Then there exists a sequence of functions $f^{(m)} \in J_2$, $m \ge 1$, such that

$$\int_0^t [f(s, \omega) - f^{(m)}(s, \omega)]^2\,ds \xrightarrow{\ \mathsf{P}\ } 0$$

for each $t > 0$ (the symbol $\xrightarrow{\mathsf{P}}$ indicates convergence in probability).
Since

$$\mathsf{P}\bigl\{|I_t(f^{(m)}) - I_t(f^{(n)})| > \delta\bigr\} \le \frac{\varepsilon}{\delta^2} + \mathsf{P}\Bigl\{\int_0^t [f^{(m)}(s, \omega) - f^{(n)}(s, \omega)]^2\,ds > \varepsilon\Bigr\}$$

for $\varepsilon > 0$ and $\delta > 0$, letting first $m, n \to \infty$ and then $\varepsilon \downarrow 0$ we obtain

$$\lim_{m,n \to \infty} \mathsf{P}\bigl\{|I_t(f^{(m)}) - I_t(f^{(n)})| > \delta\bigr\} = 0$$

for each $\delta > 0$. Hence the sequence $I_t(f^{(m)})$, $m \ge 1$, is fundamental in probability, and by the corresponding Cauchy criterion ([439; Chapter II, § 10]) there exists a random variable $I_t(f)$ such that

$$I_t(f^{(m)}) \xrightarrow{\ \mathsf{P}\ } I_t(f) \quad\text{as } m \to \infty.$$

The variable $I_t(f)$ (denoted also by $(f \cdot B)_t$, $\int_{(0,t]} f(s, \omega)\,dB_s$, or $\int_0^t f(s, \omega)\,dB_s$) is called the stochastic integral of $f$ over $(0, t]$.

We now mention several properties of the stochastic integrals $I_t(f)$, $t > 0$, for $f \in J_1$.

It can be shown that, again, we can define the stochastic integrals $I_t(f)$ in a coordinated manner for different $t > 0$ so that the process $I(f) = (I_t(f))_{t \ge 0}$ has ($\mathsf{P}$-a.s.) continuous trajectories.

The above-mentioned properties (a), (b), (c) holding in the case of $f \in J_2$ also persist for $f$ in $J_1$. However, (d) does not hold any more in general. It can be replaced now by the following property:

(d') for $f \in J_1$ the process $I(f) = (I_t(f))_{t \ge 0}$ is a local martingale, i.e., there exists a sequence $(\tau_n)_{n \ge 1}$ of Markov times such that $\tau_n \uparrow \infty$ as $n \to \infty$ and the ‘stopped’ processes

$$I^{\tau_n}(f) = (I_{t \wedge \tau_n}(f))_{t \ge 0}$$

are martingales for each $n \ge 1$ (cf. Definition 4 in Chapter II, § 1c).


4. Let $B = (B_t)_{t\ge0}$ be a Brownian motion defined on a probability space $(\Omega, \mathcal{F}, P)$
and let $(\mathcal{F}_t)_{t\ge0}$ be the family of $\sigma$-algebras generated by this process (see § 3b.2;
for more precision we shall also denote $\mathcal{F}_t$ by $\mathcal{F}_t^B$, $t \ge 0$).

The following theorem, based on the concept of a stochastic integral, describes 
the structure of Brownian functionals. 


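THEOREM 1. Let $X = X(\omega)$ be a $\mathcal{F}_T^B$-measurable random variable.

1. If $EX^2 < \infty$, then there exists a stochastic process $f = (f_t(\omega), \mathcal{F}_t^B)_{t\le T}$ such
that

$$
E\int_0^T f_t^2(\omega)\,dt < \infty \tag{17}
$$

and (P-a.s.)

$$
X = EX + \int_0^T f_t(\omega)\,dB_t. \tag{18}
$$

2. If $E|X| < \infty$, then the representation (18) holds for some process
$f = (f_t(\omega), \mathcal{F}_t^B)_{t\le T}$ such that

$$
P\Big(\int_0^T f_t^2(\omega)\,dt < \infty\Big) = 1. \tag{19}
$$

3. Let $X = X(\omega)$ be a positive random variable ($P(X > 0) = 1$) such that $EX < \infty$.
Then there exists a process $\varphi = (\varphi_t(\omega), \mathcal{F}_t^B)_{t\le T}$ with $P\big(\int_0^T \varphi_t^2(\omega)\,dt < \infty\big) = 1$
such that (P-a.s.)

$$
X = EX \exp\Big\{\int_0^T \varphi_t(\omega)\,dB_t - \frac12 \int_0^T \varphi_t^2(\omega)\,dt\Big\}. \tag{20}
$$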


From this theorem we derive the following result on the structure of Brownian 
martingales. 


THEOREM 2. 1. Let $M = (M_t, \mathcal{F}_t^B)_{t\le T}$ be a square integrable martingale. Then
there exists a process $f = (f_t(\omega), \mathcal{F}_t^B)_{t\le T}$ satisfying (17) such that

$$
M_t = M_0 + \int_0^t f_s(\omega)\,dB_s. \tag{21}
$$

2. Let $M = (M_t, \mathcal{F}_t^B)_{t\le T}$ be a local martingale. Then the representation (21)
holds for some process $f = (f_t(\omega), \mathcal{F}_t^B)_{t\le T}$ satisfying (19).

3. Let $M = (M_t, \mathcal{F}_t^B)_{t\le T}$ be a positive local martingale. Then there exists a
process $\varphi = (\varphi_t(\omega), \mathcal{F}_t^B)_{t\le T}$ such that $P\big(\int_0^T \varphi_t^2(\omega)\,dt < \infty\big) = 1$ and

$$
M_t = M_0 \exp\Big\{\int_0^t \varphi_s(\omega)\,dB_s - \frac12 \int_0^t \varphi_s^2(\omega)\,ds\Big\}.
$$

The proofs of these theorems, which are mainly due to J. M. C. Clark [77], but
were also proved in different versions by K. Itô [244] and J. Doob [109], [110], can
be found in many books; see, e.g., [266], [303], or [402].


§ 3d. Itô Processes and Itô's Formula


1. The above definition of a stochastic integral plays a key role in distinguishing
the following important class of stochastic processes.

We shall say that a stochastic process $X = (X_t)_{t\ge0}$ defined on a filtered probability
space $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t\ge0}, P)$ satisfying the usual conditions (see §3b.3) is an Itô process
if there exist two nonanticipating processes $a = (a(t,\omega))_{t\ge0}$ and $b = (b(t,\omega))_{t\ge0}$
such that

$$
P\Big(\int_0^t |a(s,\omega)|\,ds < \infty\Big) = 1, \quad t > 0, \tag{1}
$$

$$
P\Big(\int_0^t b^2(s,\omega)\,ds < \infty\Big) = 1, \quad t > 0, \tag{2}
$$

and

$$
X_t = X_0 + \int_0^t a(s,\omega)\,ds + \int_0^t b(s,\omega)\,dB_s, \tag{3}
$$

where $B = (B_t, \mathcal{F}_t)_{t\ge0}$ is a Brownian motion and $X_0$ is a $\mathcal{F}_0$-measurable random
variable.

For brevity, one uses in the discussion of Itô processes the following (formal)
differential notation in place of the integral notation (3):

$$
dX_t = a(t,\omega)\,dt + b(t,\omega)\,dB_t; \tag{4}
$$

here one says that the process $X = (X_t)_{t\ge0}$ has the stochastic differential (4).


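2. Now let $F(t,x)$ be a function from the class $C^{1,2}$ defined in $\mathbb{R}_+ \times \mathbb{R}$ (i.e., $F$ is
a function with continuous derivatives $\frac{\partial F}{\partial t}$, $\frac{\partial F}{\partial x}$, and $\frac{\partial^2 F}{\partial x^2}$) and let $X = (X_t)_{t\ge0}$
be a process with differential (4).

Under these assumptions, as proved by Itô, the process $F = (F(t, X_t))_{t\ge0}$ also
has a stochastic differential and

$$
dF(t, X_t) = \Big[\frac{\partial F}{\partial t} + a(t,\omega)\frac{\partial F}{\partial x}
  + \frac12 b^2(t,\omega)\frac{\partial^2 F}{\partial x^2}\Big]dt
  + \frac{\partial F}{\partial x}\,b(t,\omega)\,dB_t. \tag{5}
$$

More precisely, for each $t > 0$ we have the following Itô's formula (formula of
the change of variables) for $F(t, X_t)$:

$$
F(t, X_t) = F(0, X_0)
  + \int_0^t \Big[\frac{\partial F}{\partial s} + a(s,\omega)\frac{\partial F}{\partial x}
  + \frac12 b^2(s,\omega)\frac{\partial^2 F}{\partial x^2}\Big]ds
  + \int_0^t \frac{\partial F}{\partial x}\,b(s,\omega)\,dB_s. \tag{6}
$$

(The proof can be found in many places; see, e.g., [123], [250; Chapter I, § 4e],
or [303; Chapter 4, §3].)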


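3. We also present a generalization of (6) to several dimensions.

We assume that $B = (B^1, \dots, B^d)$ is a $d$-dimensional Brownian motion with
independent (one-dimensional) Brownian components $B^i = (B^i_t)_{t\ge0}$, $i = 1, \dots, d$.

We say that the process $X = (X^1, \dots, X^d)$ with $X^i = (X^i_t)_{t\ge0}$ is a $d$-dimensional
Itô process if there exist a vector $a = (a^1, \dots, a^d)$ and a matrix $b = \|b^{ij}\|$ of order
$d \times d$ with nonanticipating components $a^i = a^i(t,\omega)$ and entries $b^{ij} = b^{ij}(t,\omega)$ such
that

$$
P\Big(\int_0^t |a^i(s,\omega)|\,ds < \infty\Big) = 1, \qquad
P\Big(\int_0^t (b^{ij}(s,\omega))^2\,ds < \infty\Big) = 1
$$

for $t > 0$ and

$$
dX^i_t = a^i(t,\omega)\,dt + \sum_{j=1}^d b^{ij}(t,\omega)\,dB^j_t \tag{7}
$$

for $i = 1, \dots, d$, or, in the vector notation,

$$
dX_t = a(t,\omega)\,dt + b(t,\omega)\,dB_t,
$$

where $a(t,\omega) = (a^1(t,\omega), \dots, a^d(t,\omega))$ and $B_t = (B^1_t, \dots, B^d_t)$ are column vectors.

Now let $F(t, x_1, \dots, x_d)$ be a continuous function with continuous derivatives
$\frac{\partial F}{\partial t}$, $\frac{\partial F}{\partial x_i}$, and $\frac{\partial^2 F}{\partial x_i \partial x_j}$, $i, j = 1, \dots, d$.
Then we have the following $d$-dimensional version of Itô's formula:

$$
\begin{aligned}
F(t, X^1_t, \dots, X^d_t) ={}& F(0, X^1_0, \dots, X^d_0) \\
&+ \int_0^t \Big[\frac{\partial F}{\partial s}(s, X^1_s, \dots, X^d_s)
   + \sum_{i=1}^d \frac{\partial F}{\partial x_i}(s, X^1_s, \dots, X^d_s)\,a^i(s,\omega) \\
&\qquad\quad + \frac12 \sum_{i,j=1}^d \frac{\partial^2 F}{\partial x_i \partial x_j}(s, X^1_s, \dots, X^d_s)
   \sum_{k=1}^d b^{ik}(s,\omega)\,b^{jk}(s,\omega)\Big]ds \\
&+ \sum_{i,j=1}^d \int_0^t \frac{\partial F}{\partial x_i}(s, X^1_s, \dots, X^d_s)\,b^{ij}(s,\omega)\,dB^j_s.
\end{aligned} \tag{8}
$$

For continuous processes $X^i = (X^i_t)_{t\ge0}$ we shall use the notation
$\langle X^i, X^j\rangle = (\langle X^i, X^j\rangle_t)_{t\ge0}$, where

$$
\langle X^i, X^j\rangle_t = \sum_{k=1}^d \int_0^t b^{ik}(s,\omega)\,b^{jk}(s,\omega)\,ds.
$$

Then (8) can be rewritten in the following compact form:

$$
F(t, X_t) = F(0, X_0) + \int_0^t \frac{\partial F}{\partial s}(s, X_s)\,ds
  + \sum_{i=1}^d \int_0^t \frac{\partial F}{\partial x_i}(s, X_s)\,dX^i_s
  + \frac12 \sum_{i,j=1}^d \int_0^t \frac{\partial^2 F}{\partial x_i \partial x_j}(s, X_s)\,d\langle X^i, X^j\rangle_s. \tag{9}
$$

Using differentials, this can be rewritten again, as

$$
dF = \frac{\partial F}{\partial t}\,dt + \sum_{i=1}^d \frac{\partial F}{\partial x_i}\,dX^i_t
  + \frac12 \sum_{i,j=1}^d \frac{\partial^2 F}{\partial x_i \partial x_j}\,d\langle X^i, X^j\rangle_t \tag{10}
$$

(here $F = F(t, X_t)$).

It is worth noting that if we formally write (using Taylor's formula)

$$
dF = \frac{\partial F}{\partial t}\,dt + \sum_{i=1}^d \frac{\partial F}{\partial x_i}\,dX^i_t
  + \frac12 \sum_{i,j=1}^d \frac{\partial^2 F}{\partial x_i \partial x_j}\,dX^i_t\,dX^j_t, \tag{11}
$$

and agree that

$$
(dB^i_t)^2 = dt, \tag{12}
$$

$$
dB^i_t\,dt = 0, \tag{13}
$$

$$
dB^i_t\,dB^j_t = 0, \quad i \ne j, \tag{14}
$$

then we obtain (10) from (11) because

$$
dX^i_t\,dX^j_t = d\langle X^i, X^j\rangle_t. \tag{15}
$$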


Remark 1. The formal expressions (15) and (14) can be interpreted quite meaningfully,
as symbolically written limit relations (2) and (3) from the preceding section, §3b.
We can also interpret relation (13) in a similar way.

For the proof of Itô's formula in several dimensions see, e.g., the monograph
[250; Chapter I, § 4e]. As regards generalizations of this formula to functions
$F \notin C^{1,2}$ see, e.g., [166] and [402].


4. We now present several examples based on Itô's formula.


EXAMPLE 1. Let $F(x) = x^2$ and let $X_t = B_t$. Then, by (11), we formally obtain

$$
dB_t^2 = 2B_t\,dB_t + (dB_t)^2.
$$

In view of (12), we see that

$$
dB_t^2 = 2B_t\,dB_t + dt, \tag{16}
$$

or, in the integral form,

$$
B_t^2 = 2\int_0^t B_s\,dB_s + t. \tag{17}
$$
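
The integral form (17) can be checked numerically. The following is only a sketch, assuming a Brownian path discretized on a uniform grid and the stochastic integral replaced by its left-endpoint (Itô) sum; the function name `ito_sum_check` and all parameters are illustrative and not part of the text. On the grid the identity $B_t^2 = 2\sum B_{t_k}\Delta B_k + \sum (\Delta B_k)^2$ is exact, and the last sum is close to $t$, which is the discrete counterpart of the rule (12).

```python
import math
import random

def ito_sum_check(t=1.0, n=20000, seed=0):
    """Simulate one discretized Brownian path on [0, t] and return
    (B_t^2, 2 * left-endpoint sum of B dB, sum of squared increments)."""
    rng = random.Random(seed)
    dt = t / n
    b = 0.0      # current value of the path
    ito = 0.0    # left-endpoint (Ito) sum approximating the integral of B dB
    qv = 0.0     # sum of squared increments, approximating t
    for _ in range(n):
        db = rng.gauss(0.0, math.sqrt(dt))
        ito += b * db        # integrand evaluated BEFORE the increment
        qv += db * db
        b += db
    return b * b, 2.0 * ito, qv

b_squared, two_ito, qv = ito_sum_check()
print("B_t^2 - 2*sum(B dB) =", b_squared - two_ito)
print("sum of (dB)^2       =", qv)
```

Evaluating the integrand at the left endpoint matters here: a midpoint or right-endpoint sum would converge to a different limit, which is why the correction term $t$ appears in (17).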
EXAMPLE 2. Let $F(x) = e^x$ and $X_t = B_t$. Then

$$
d(e^{B_t}) = e^{B_t}\,dB_t + \frac12 e^{B_t}\,(dB_t)^2,
$$

i.e., bearing in mind (11),

$$
d(e^{B_t}) = e^{B_t}\Big(dB_t + \frac12\,dt\Big). \tag{18}
$$

Let $F(t,x) = e^{x - \frac12 t}$ and let $X_t = B_t$. Then we formally obtain

$$
dF(t, B_t) = -\frac12 F(t, B_t)\,dt + F(t, B_t)\,dB_t + \frac12 F(t, B_t)\,(dB_t)^2.
$$

In view of (12), we obtain

$$
dF(t, B_t) = F(t, B_t)\,dB_t.
$$

Considering the stochastic exponential

$$
\mathcal{E}(B)_t = e^{B_t - \frac12 t} \tag{19}
$$

(cf. formula (13) in Chapter II, § 1a), we see that it has the stochastic differential

$$
d\mathcal{E}(B)_t = \mathcal{E}(B)_t\,dB_t. \tag{20}
$$

This relation can be treated as a stochastic differential equation (see § 3a and §3e
further on), with a solution delivered by (19).
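
As a rough numerical illustration of the relation between (19) and (20), one can integrate the equation $d\mathcal{E} = \mathcal{E}\,dB$ by the simplest Euler scheme and compare the result with the closed form $e^{B_t - t/2}$ on the same path. This is a sketch under assumed discretization parameters; the name `stochastic_exponential` is hypothetical.

```python
import math
import random

def stochastic_exponential(t=1.0, n=20000, seed=2):
    """Return (closed form (19), Euler approximation of (20)) on one path."""
    rng = random.Random(seed)
    dt = t / n
    b = 0.0          # Brownian path
    e_euler = 1.0    # Euler iterate for dE = E dB, started at E_0 = 1
    for _ in range(n):
        db = rng.gauss(0.0, math.sqrt(dt))
        e_euler += e_euler * db
        b += db
    e_exact = math.exp(b - 0.5 * t)
    return e_exact, e_euler

e_exact, e_euler = stochastic_exponential()
print(e_exact, e_euler)
```

The two numbers agree up to a discretization error that shrinks as `n` grows; without the term $-t/2$ in the exponent they would not.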


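EXAMPLE 3. Putting the preceding example in a broader context we consider now
the process

$$
Z_t = \exp\Big\{\int_0^t b(s,\omega)\,dB_s - \frac12 \int_0^t b^2(s,\omega)\,ds\Big\}, \tag{21}
$$

where $b = (b(t,\omega))_{t\ge0}$ is a nonanticipating process with

$$
P\Big(\int_0^t b^2(s,\omega)\,ds < \infty\Big) = 1, \quad t > 0.
$$

Setting

$$
X_t = \int_0^t b(s,\omega)\,dB_s - \frac12 \int_0^t b^2(s,\omega)\,ds
$$

and $F(x) = e^x$ we can use Itô's formula (5) to see that the process $Z = (Z_t)_{t\ge0}$
(the 'Girsanov exponential') has the stochastic differential

$$
dZ_t = Z_t\,b(t,\omega)\,dB_t. \tag{22}
$$

EXAMPLE 4. Let $X_t = B_t$ and let $Y_t = t$. Then

$$
d(B_t\,t) = t\,dB_t + B_t\,dt,
$$

or, in the integral form,

$$
B_t\,t = \int_0^t s\,dB_s + \int_0^t B_s\,ds \tag{23}
$$

(cf. N. Wiener's definition of the stochastic integral $\int_0^t s\,dB_s$ in §3c.1).


EXAMPLE 5. Let $F(x_1, x_2) = x_1 x_2$ and let $X^1 = (X^1_t)_{t\ge0}$, $X^2 = (X^2_t)_{t\ge0}$ be two
processes having Itô differentials. Then we formally obtain

$$
d(X^1_t X^2_t) = X^1_t\,dX^2_t + X^2_t\,dX^1_t + dX^1_t\,dX^2_t. \tag{24}
$$

In particular, if

$$
dX^i_t = a^i(t,\omega)\,dt + b^i(t,\omega)\,dB^i_t, \quad i = 1, 2,
$$

with independent Brownian motions $B^1$ and $B^2$ (so that $dB^1_t\,dB^2_t = 0$ by (14)), then

$$
d(X^1_t X^2_t) = X^1_t\,dX^2_t + X^2_t\,dX^1_t. \tag{25}
$$

On the other hand, if both processes are driven by the same Brownian motion,

$$
dX^i_t = a^i(t,\omega)\,dt + b^i(t,\omega)\,dB_t, \quad i = 1, 2,
$$

then

$$
d(X^1_t X^2_t) = X^1_t\,dX^2_t + X^2_t\,dX^1_t + b^1(t,\omega)\,b^2(t,\omega)\,dt. \tag{26}
$$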


EXAMPLE 6. Let $X = (X^1, \dots, X^d)$ be a $d$-dimensional Itô process whose components
$X^i$ have stochastic differentials (7).

Let $V = V(x)$ be a real-valued function of $x = (x_1, \dots, x_d)$ with continuous
second derivatives and let

$$
(L_t V)(x, \omega) = \sum_{i=1}^d a^i(t,\omega)\,\frac{\partial V}{\partial x_i}
  + \frac12 \sum_{i,j=1}^d \sum_{k=1}^d b^{ik}(t,\omega)\,b^{jk}(t,\omega)\,\frac{\partial^2 V}{\partial x_i \partial x_j}. \tag{27}
$$

Then the process $(V(X_t))_{t\ge0}$ has the stochastic differential

$$
dV(X_t) = (L_t V)(X_t, \omega)\,dt + \frac{\partial V}{\partial x}(X_t)\,b(t,\omega)\,dB_t \tag{28}
$$

(we use the matrix notation), where

$$
\frac{\partial V}{\partial x}(X_t)\,b(t,\omega)\,dB_t
  = \sum_{i,j=1}^d \frac{\partial V}{\partial x_i}(X_t)\,b^{ij}(t,\omega)\,dB^j_t. \tag{29}
$$

EXAMPLE 7. Let $V = V(t,x)$ be a continuous real-valued function in $[0,\infty) \times \mathbb{R}^d$
with continuous derivatives $\frac{\partial V}{\partial t}$, $\frac{\partial V}{\partial x_i}$, and $\frac{\partial^2 V}{\partial x_i \partial x_j}$. Also, let

$$
\varphi_t = \int_0^t C(s,\omega)\,ds,
$$

where $C = C(t,\omega)$ is a nonanticipating function with

$$
P\Big(\int_0^t |C(s,\omega)|\,ds < \infty\Big) = 1, \quad t > 0.
$$

Then $(e^{-\varphi_t} V(t, X_t))_{t\ge0}$ is an Itô process with stochastic differential

$$
d\big(e^{-\varphi_t} V(t, X_t)\big) = e^{-\varphi_t}\Big[\frac{\partial V}{\partial t}(t, X_t) + (L_t V)(X_t, \omega) - C(t,\omega)\,V(t, X_t)\Big]dt
  + e^{-\varphi_t}\,\frac{\partial V}{\partial x}(t, X_t)\,b(t,\omega)\,dB_t. \tag{30}
$$

5. Remark 2. Let $X = (X_t)_{t\ge0}$ be a diffusion Markov process with stochastic
differential

$$
dX_t = a(t, X_t)\,dt + b(t, X_t)\,dB_t,
$$

where $\int_0^t |a(s, X_s)|\,ds < \infty$, $\int_0^t b^2(s, X_s)\,ds < \infty$ (P-a.s.), and $t > 0$ (cf. formulas (1)
in §3a and (9) in §3e).

If $Y_t = F(t, X_t)$, where $F = F(t,x) \in C^{1,2}$ and $\frac{\partial F}{\partial x} > 0$, then $Y = (Y_t)_{t\ge0}$ is
also a diffusion Markov process with

$$
dY_t = \alpha(t, Y_t)\,dt + \beta(t, Y_t)\,dB_t,
$$

where

$$
\alpha(t, y) = \frac{\partial F}{\partial t}(t,x) + \frac{\partial F}{\partial x}(t,x)\,a(t,x)
  + \frac12 \frac{\partial^2 F}{\partial x^2}(t,x)\,b^2(t,x), \tag{31}
$$

$$
\beta(t, y) = \frac{\partial F}{\partial x}(t,x)\,b(t,x), \tag{32}
$$

and $t$, $x$, and $y$ are related by the equality $y = F(t,x)$.

These formulas, which describe the transformation of the (local) characteristics
$a(t,x)$ and $b(t,x)$ of the Markov process $X$ into the (local) characteristics $\alpha(t,y)$ and
$\beta(t,y)$ of the Markov process $Y$, were obtained by A. N. Kolmogorov [280; § 17]
as long ago as 1931. They are consequences of Itô's formula (5). It would be natural
for that reason to call (31) and (32) the Kolmogorov-Itô formulas.
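
As an illustration of (31) and (32), here is a short worked computation. The choice of coefficients $a(t,x) = ax$, $b(t,x) = \sigma x$ (with constants $a$ and $\sigma > 0$) and of the transformation $F(t,x) = \ln x$ on the half-line $x > 0$ is an assumption made only for this example.

```latex
% Assumed for illustration: a(t,x) = a x, b(t,x) = \sigma x, F(t,x) = \ln x, x > 0.
\begin{align*}
\frac{\partial F}{\partial t} &= 0, \qquad
\frac{\partial F}{\partial x} = \frac{1}{x} > 0, \qquad
\frac{\partial^2 F}{\partial x^2} = -\frac{1}{x^2}, \\
\alpha(t,y) &= 0 + \frac{1}{x}\, a x + \frac12 \Big(-\frac{1}{x^2}\Big) \sigma^2 x^2
             = a - \frac{\sigma^2}{2}, \\
\beta(t,y)  &= \frac{1}{x}\, \sigma x = \sigma .
\end{align*}
% Hence Y_t = \ln X_t satisfies dY_t = (a - \sigma^2/2)\,dt + \sigma\,dB_t,
% i.e. Y is a Brownian motion with constant drift and diffusion coefficient.
```

In this case the transformed characteristics do not depend on $y$ at all, so $Y_t = Y_0 + (a - \sigma^2/2)t + \sigma B_t$.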
§ 3e. Stochastic Differential Equations


1. Among Itô processes $X = (X_t)_{t\ge0}$ with stochastic differentials

$$
dX_t = \alpha(t,\omega)\,dt + \beta(t,\omega)\,dB_t, \tag{1}
$$

a major role is played by processes such that the dependence of the corresponding
coefficients $\alpha(t,\omega)$ and $\beta(t,\omega)$ on $\omega$ factors through the process $X_t(\omega)$ itself, i.e.,

$$
\alpha(t,\omega) = a(t, X_t(\omega)), \qquad \beta(t,\omega) = b(t, X_t(\omega)), \tag{2}
$$

where $a = a(t,x)$ and $b = b(t,x)$ are measurable functions on $\mathbb{R}_+ \times \mathbb{R}$.

For instance, the process

$$
S_t = S_0\,e^{at}\,e^{\sigma B_t - \frac{\sigma^2}{2}t}, \tag{3}
$$

which is called a geometric (or economic) Brownian motion (see § 3a), has (in
accordance with Itô's formula) the stochastic differential

$$
dS_t = aS_t\,dt + \sigma S_t\,dB_t. \tag{4}
$$

It is easy to verify using the same Itô formula that the process

$$
Y_t = \int_0^t \frac{S_t}{S_u}\,du \tag{5}
$$

has the differential

$$
dY_t = (1 + aY_t)\,dt + \sigma Y_t\,dB_t. \tag{6}
$$

(This process, $Y = (Y_t)_{t\ge0}$, plays an important role in problems of the quickest
detection of changes in the local drift of a Brownian motion; see [440], [441].)

If

$$
Z_t = S_t\Big[Z_0 + (c_1 - \sigma c_2)\int_0^t \frac{du}{S_u} + c_2 \int_0^t \frac{dB_u}{S_u}\Big] \tag{7}
$$

for some constants $c_1$ and $c_2$, then, using Itô's formula again, we can verify that

$$
dZ_t = (c_1 + aZ_t)\,dt + (c_2 + \sigma Z_t)\,dB_t. \tag{8}
$$

In the above examples we started from 'explicit' formulas for the processes
$S = (S_t)$, $Y = (Y_t)$, and $Z = (Z_t)$ and found their stochastic differentials (4), (6),
and (8) using Itô's formula.

However, one can change the standpoint; namely, one can regard (4), (6), and (8)
as stochastic differential equations with respect to unknown processes $S = (S_t)$,
$Y = (Y_t)$, $Z = (Z_t)$ and can attempt to prove that their solutions (3), (5), and (7)
are unique (in one or another sense).

Of course, we must assign a precise meaning to the concept of 'stochastic differential
equation', define its 'solution', and explain what the 'uniqueness' of a solution
means. The above-considered notion of a stochastic integral will play a key role in
our introduction of all these concepts.
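
The two standpoints can be compared numerically. The sketch below is an illustration under assumed parameter values (`a`, `sigma`, the grid size and the normalization $S_0 = 1$ are arbitrary choices, and `simulate_y` is a hypothetical name): it evaluates the 'explicit' formula (5) by a Riemann sum and, on the same Brownian path, integrates equation (6) by the Euler scheme.

```python
import math
import random

def simulate_y(a=0.1, sigma=0.3, t=1.0, n=20000, seed=1):
    """Return (Y_t from the explicit formula (5), Y_t from an Euler scheme for (6)),
    both computed along the same simulated Brownian path, with S_0 = 1."""
    rng = random.Random(seed)
    dt = t / n
    b = 0.0                # Brownian path B_u
    inv_s_integral = 0.0   # left-point Riemann sum of du / S_u
    y_euler = 0.0          # Euler iterate for dY = (1 + aY) dt + sigma Y dB, Y_0 = 0
    for k in range(n):
        u = k * dt
        s_u = math.exp(a * u + sigma * b - 0.5 * sigma ** 2 * u)   # formula (3)
        inv_s_integral += dt / s_u
        db = rng.gauss(0.0, math.sqrt(dt))
        y_euler += (1.0 + a * y_euler) * dt + sigma * y_euler * db
        b += db
    s_t = math.exp(a * t + sigma * b - 0.5 * sigma ** 2 * t)
    y_explicit = s_t * inv_s_integral                               # formula (5)
    return y_explicit, y_euler

y_explicit, y_euler = simulate_y()
print(y_explicit, y_euler)
```

Agreement of the two values on a fine grid is of course not a proof of uniqueness; it only illustrates that the explicit process and the solution of the equation coincide pathwise.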
2. Let $B = (B_t, \mathcal{F}_t)_{t\ge0}$ be a Brownian motion defined on a filtered probability
space (stochastic basis) $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t\ge0}, P)$ satisfying the usual conditions (§ 7a.2).
Let $a = a(t,x)$ and $b = b(t,x)$ be measurable functions on $\mathbb{R}_+ \times \mathbb{R}$.


DEFINITION 1. We say that a stochastic differential equation

$$
dX_t = a(t, X_t)\,dt + b(t, X_t)\,dB_t \tag{9}
$$

with $\mathcal{F}_0$-measurable initial condition $X_0$ has a continuous strong solution (or simply
a solution) $X = (X_t)_{t\ge0}$ if for each $t > 0$

$$
X_t \text{ is } \mathcal{F}_t\text{-measurable},
$$

$$
P\Big(\int_0^t |a(s, X_s)|\,ds < \infty\Big) = 1, \tag{10}
$$

$$
P\Big(\int_0^t b^2(s, X_s)\,ds < \infty\Big) = 1, \tag{11}
$$

and (P-a.s.)

$$
X_t = X_0 + \int_0^t a(s, X_s)\,ds + \int_0^t b(s, X_s)\,dB_s. \tag{12}
$$

DEFINITION 2. We say that two continuous stochastic processes $X = (X_t)_{t\ge0}$ and
$Y = (Y_t)_{t\ge0}$ are stochastically indistinguishable if

$$
P\Big(\sup_{s\le t} |X_s - Y_s| > 0\Big) = 0 \tag{13}
$$

for each $t > 0$.


DEFINITION 3. We say that the measurable functions $a = a(t,x)$ and $b = b(t,x)$ on $\mathbb{R}_+ \times \mathbb{R}$ satisfy
the local Lipschitz condition (with respect to the phase variable $x$) if for each $n \ge 1$
there exists a quantity $K(n)$ such that

$$
|a(t,x) - a(t,y)| + |b(t,x) - b(t,y)| \le K(n)\,|x - y| \tag{14}
$$

for all $t \ge 0$ and $x$, $y$ such that $|x| \le n$ and $|y| \le n$.


THEOREM 1 (K. Itô [242], [243]; see also, e.g., [123; Chapter 9], [288; Chapter V],
or [303; Chapter 4]). Assume that the coefficients $a(t,x)$ and $b(t,x)$ satisfy the local
Lipschitz condition and the condition of linear growth

$$
|a(t,x)| + |b(t,x)| \le K(1)\,|x|, \tag{14'}
$$

and that the initial value $X_0$ is $\mathcal{F}_0$-measurable.

Then the stochastic differential equation (9) has a (unique, up to stochastic
indistinguishability) continuous solution $X = (X_t, \mathcal{F}_t)$, which is a Markov process.

This result can be generalized in various directions: the local Lipschitz conditions
can be weakened, the coefficients may depend on $\omega$ (although in a special way),
one can consider the case when the coefficients $a = a(t, X_t)$ and $b = b(t, X_t)$ depend
on the 'past' (slightly abusing the notation, we write in this case $a = a(t; X_s, s \le t)$
and $b = b(t; X_s, s \le t)$).

There also exist generalizations to several dimensions, when $X = (X^1, \dots, X^d)$
is a multivariate process, $a = a(t,x)$ is a vector, $b = b(t,x)$ is a matrix, and
$B = (B^1, \dots, B^d)$ is a $d$-dimensional Brownian motion; see, e.g., [123], [288], and
[303] on this subject.

Of all the generalizations we present here only one, somewhat unexpected result
of A. K. Zvonkin [485], which shows that the local Lipschitz condition is not
necessary for the existence of a strong solution of a stochastic differential equation

$$
dX_t = a(t, X_t)\,dt + dB_t; \tag{15}
$$

the mere measurability with respect to $(t,x)$ and the uniform boundedness of the
coefficient $a(t,x)$ suffice. (A multidimensional generalization of this result was
proved by A. Yu. Veretennikov [471].)

Hence, for example, the stochastic differential equation

$$
dX_t = \sigma(X_t)\,dt + dB_t, \quad X_0 = 0, \tag{16}
$$

with 'bad' coefficient

$$
\sigma(x) = \begin{cases} 1, & x \ge 0, \\ -1, & x < 0, \end{cases} \tag{17}
$$

has a strong solution (moreover, a unique one).
Note, however, that if in place of (16) we consider the equation

$$
dX_t = \sigma(X_t)\,dB_t, \quad X_0 = 0, \tag{18}
$$

with the same $\sigma(x)$, then the situation changes drastically, because, first, this
equation is known to have at least two strong solutions on some probability spaces.
Second, this equation has no strong solutions whatsoever on some other probability
spaces.

To prove the first assertion we consider a coordinate Wiener process $W = (W_t)_{t\ge0}$
in the space of continuous functions $\omega = (\omega_t)_{t\ge0}$ on $[0, +\infty)$ endowed with Wiener
measure, i.e., a process defined as $W_t(\omega) = \omega_t$, $t \ge 0$.

By Lévy's theorem (see § 3b.3), if

$$
B_t = \int_0^t \sigma(W_s)\,dW_s,
$$

then $B = (B_t)_{t\ge0}$ is also a Wiener process (a Brownian motion). Moreover, it is
easy to see that

$$
\int_0^t \sigma(W_s)\,dB_s = \int_0^t \sigma^2(W_s)\,dW_s = W_t,
$$

because $\sigma^2(x) = 1$.

Hence the process $W = (W_t)_{t\ge0}$ is a solution of equation (18) (in our coordinate
probability space) with a specially designed Brownian motion $B$. However,
$\sigma(-x) = -\sigma(x)$, so that

$$
\int_0^t \sigma(-W_s)\,dB_s = -\int_0^t \sigma(W_s)\,dB_s = -W_t,
$$

i.e., the process $-W = (-W_t)_{t\ge0}$ is also a solution of (18).
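
The construction above can be mimicked on a time grid. The following sketch is only a discretized illustration (the names `sign_sigma` and `two_solutions` and the grid parameters are assumptions): it builds the increments of $B$ from those of $W$ and then forms the Itô sums of $\sigma(W)$ and of $\sigma(-W)$ against $B$.

```python
import math
import random

def sign_sigma(x):
    """The 'bad' coefficient (17): +1 for x >= 0 and -1 for x < 0."""
    return 1.0 if x >= 0 else -1.0

def two_solutions(t=1.0, n=10000, seed=3):
    """Return (W_t, Ito sum of sigma(W) dB, Ito sum of sigma(-W) dB) on one path."""
    rng = random.Random(seed)
    dt = t / n
    w = 0.0
    x_plus = 0.0    # approximates the integral of sigma(W_s) dB_s
    x_minus = 0.0   # approximates the integral of sigma(-W_s) dB_s
    for _ in range(n):
        dw = rng.gauss(0.0, math.sqrt(dt))
        db = sign_sigma(w) * dw           # increment of B = integral of sigma(W) dW
        x_plus += sign_sigma(w) * db      # sigma^2 = 1, so this equals dw
        x_minus += sign_sigma(-w) * db
        w += dw
    return w, x_plus, x_minus

w_t, x_plus, x_minus = two_solutions()
print(w_t, x_plus, x_minus)
```

On the grid, `x_plus` reproduces `w_t` exactly, while `x_minus` differs from `-w_t` only through the steps at which the path sits exactly at zero (here the very first one, since $\sigma(0) = \sigma(-0) = 1$); this contribution vanishes as the grid is refined, in agreement with the fact that a Brownian path spends zero Lebesgue time at the origin.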
As regards the second assertion, assume that the equation

$$
X_t = \int_0^t \sigma(X_s)\,dB_s
$$

has a strong solution (with respect to the flow of $\sigma$-algebras $(\mathcal{F}_t^B)_{t\ge0}$ generated
by a Brownian motion $B$). By Lévy's theorem, this process $X = (X_t, \mathcal{F}_t^B)_{t\ge0}$ is a
Brownian motion.

By Tanaka's formula (see § 5c or [402] and compare with the example in Chapter II, §1b)

$$
|X_t| = \int_0^t \sigma(X_s)\,dX_s + L_t(0),
$$

where

$$
L_t(0) = \lim_{\varepsilon\downarrow0} \frac{1}{2\varepsilon} \int_0^t I(|X_s| \le \varepsilon)\,ds
$$

is the local time (P. Lévy) of the Brownian motion $X$ at the origin over the period
$[0, t]$.

Hence (P-a.s.)

$$
B_t = \int_0^t \sigma(X_s)\,dX_s = |X_t| - L_t(0),
$$

so that $\mathcal{F}_t^B \subseteq \mathcal{F}_t^{|X|}$.

The above assumption that $X$ is adapted to the flow $\mathcal{F}^B = (\mathcal{F}_t^B)_{t\ge0}$ ensures
the inclusion $\mathcal{F}_t^X \subseteq \mathcal{F}_t^{|X|}$, which, of course, cannot occur for a Brownian motion $X$.
All this shows that an equation does not necessarily have a solution for an arbitrarily
chosen probability space and an arbitrary Brownian motion. (M. Barlow [20]
showed that (18) does not necessarily have a strong solution even in the case of a
bounded continuous function $\sigma = \sigma(x) > 0$.)

3. Note that, in fact, the two above-obtained solutions of (18), $W$ and $-W$, have
the same distribution, i.e.,

$$
\mathrm{Law}(W_s,\ s \ge 0) = \mathrm{Law}(-W_s,\ s \ge 0).
$$

This can be regarded as a justification for our introducing below the concept of
weak solutions of stochastic differential equations.


DEFINITION 4. Let $\mu$ be a probability Borel measure on the real line $\mathbb{R}$. We say
that a stochastic differential equation (9) with initial data $X_0$ satisfying the relation
$\mathrm{Law}(X_0) = \mu$ has a weak solution if there exist a filtered probability space
$(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t\ge0}, P)$, a Brownian motion $B = (B_t, \mathcal{F}_t)_{t\ge0}$ on this space, and a
continuous stochastic process $X = (X_t, \mathcal{F}_t)_{t\ge0}$ such that $\mathrm{Law}(X_0 \mid P) = \mu$ and (12)
holds (P-a.s.) for each $t > 0$.


It should be pointed out that, by contrast to a strong solution, which is sought on
a particular filtered probability space endowed with a particular Brownian motion,
we do not fix these objects (the probability space and the Brownian motion) in the
definition of a weak solution. We only require that they exist.

It is clear from the above definitions that one can expect a weak solution to
exist under less restrictive conditions on the coefficients of (9).

One of the first results in this direction (see [446], [457]) is as follows.

We consider a stochastic differential equation

$$
dX_t = a(X_t)\,dt + b(X_t)\,dB_t \tag{19}
$$

with initial distribution $\mathrm{Law}(X_0) = \mu$ such that $\int x^{2(1+\varepsilon)}\,\mu(dx) < \infty$ for some
$\varepsilon > 0$.

If the coefficients $a = a(x)$ and $b = b(x)$ are bounded continuous functions, then
equation (19) has a weak solution.

If, in addition, $b^2(x) > 0$ for $x \in \mathbb{R}$, then we have the uniqueness (in distribution)
of the weak solution.


Remark 1. In fact, if $b(x)$ is bounded, continuous, and nowhere vanishing, then
there exists a unique weak solution even if $a(x)$ is merely bounded and measurable
(see [457]).


4. The above results concerning weak solutions can be generalized in various ways:
to the multidimensional case, to the case of coefficients depending on the past, and
so on.

One of the most transparent results in this direction is based on Girsanov's
theorem on an absolutely continuous change of measure. In view of the importance
of this theorem in many other questions, we present it here. (As regards the
applications of this theorem in the discrete-time case, see Chapter V, §§ 3b and 3d.)

Let $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t\ge0}, P)$ be a filtered probability space and let $B = (B_t, \mathcal{F}_t)_{t\ge0}$,
$B = (B^1, \dots, B^d)$, be a $d$-dimensional Brownian motion. Let also $a = (a_t, \mathcal{F}_t)$,
$a = (a^1, \dots, a^d)$, be a $d$-dimensional stochastic process such that

$$
P\Big(\int_0^t \|a_s\|^2\,ds < \infty\Big) = 1, \quad t \le T, \tag{20}
$$

where $\|a_t\|^2 = (a^1_t)^2 + \dots + (a^d_t)^2$ and $T < \infty$.

We now construct a process $Z = (Z_t, \mathcal{F}_t)_{t\le T}$ by setting

$$
Z_t = \exp\Big\{\int_0^t (a_s, dB_s) - \frac12 \int_0^t \|a_s\|^2\,ds\Big\}, \tag{21}
$$

where

$$
(a_s, dB_s) = \sum_{i=1}^d a^i_s\,dB^i_s
$$

is the scalar product.

If

$$
E\exp\Big\{\frac12 \int_0^T \|a_s\|^2\,ds\Big\} < \infty
$$

(the Novikov condition; cf. also the corresponding condition in the discrete-time
case in Chapter V, §3b), then

$$
EZ_T = 1, \tag{22}
$$

so that $Z = (Z_t, \mathcal{F}_t)_{t\le T}$ is a uniformly integrable martingale.

Since $Z_T$ is positive (P-a.s.) and (22) holds, we can define a probability measure
$\widetilde{P}_T$ on $(\Omega, \mathcal{F}_T)$ by setting

$$
\widetilde{P}_T(A) = E[I_A Z_T], \quad A \in \mathcal{F}_T.
$$

Clearly, if $P_T = P \mid \mathcal{F}_T$ is the restriction of the measure $P$ to $\mathcal{F}_T$, then $\widetilde{P}_T \sim P_T$.

THEOREM 2 (I. V. Girsanov [183]). Let

$$
\widetilde{B}_t = B_t - \int_0^t a_s\,ds, \quad t \le T.
$$

Then $\widetilde{B} = (\widetilde{B}_t, \mathcal{F}_t, \widetilde{P}_T)_{t\le T}$ is a Brownian motion.


The proof can be found in [183] and in the present book, in Chapter VII, § 3b;
see also [266] or [303].
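
A simple Monte Carlo experiment illustrates the statement of the theorem in the special case $d = 1$ with a constant (non-random) $a_t \equiv a$, which is an assumption made here only to keep the example short; the function name `girsanov_moments` and the sample size are likewise illustrative. Under the new measure the shifted variable $B_T - aT$ should have mean $0$ and variance $T$, and the density $Z_T$ should integrate to $1$ in accordance with (22).

```python
import math
import random

def girsanov_moments(a=0.5, t=1.0, n_paths=200000, seed=4):
    """Estimate E[Z_T], E[Z_T * Btilde_T] and E[Z_T * Btilde_T^2] under P,
    i.e. the total mass, mean and second moment of Btilde_T under the new measure."""
    rng = random.Random(seed)
    m0 = m1 = m2 = 0.0
    for _ in range(n_paths):
        b_t = rng.gauss(0.0, math.sqrt(t))           # B_T under the original measure P
        z_t = math.exp(a * b_t - 0.5 * a * a * t)    # density (21) for constant a
        b_tilde = b_t - a * t                        # shifted variable of Theorem 2
        m0 += z_t
        m1 += z_t * b_tilde
        m2 += z_t * b_tilde ** 2
    return m0 / n_paths, m1 / n_paths, m2 / n_paths

mass, mean_tilde, second_moment_tilde = girsanov_moments()
print(mass, mean_tilde, second_moment_tilde)
```

The three estimates should be close to $1$, $0$, and $T = 1$, respectively; only the terminal value is sampled, so this checks the one-dimensional distribution of $\widetilde{B}_T$ rather than the whole path.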
We now consider the question of the existence of a weak solution of the one-dimensional
stochastic differential equation

$$
dX_t = a(t, X)\,dt + dB_t, \tag{23}
$$

where the coefficient $a = a(t, X)$ is in general assumed to depend on the 'past'
variables $X_s$, $s \le t$. (The case of $d > 1$ can be considered in a similar way.)

Let $C$ be the space of continuous functions $x = (x_t)_{t\ge0}$, $x_0 = 0$, let
$\mathscr{C}_t = \sigma(x\colon x_s, s \le t)$, and let $\mathscr{C} = \sigma\big(\bigcup_{t\ge0} \mathscr{C}_t\big)$. Also, let $P^W$ be the Wiener measure
in $(C, \mathscr{C})$.

We say that a functional $a = a(t,x)$, where $t \in \mathbb{R}_+$ and $x \in C$, is measurable
if it is a measurable map from $\mathbb{R}_+ \times C$ into $\mathbb{R}$, and it is said to be progressively
measurable if, in addition, for each $t > 0$ and each Borel set $A$ we have the inclusion

$$
\{(s \le t,\ x \in C)\colon a(s,x) \in A\} \in \mathscr{B}([0,t]) \otimes \mathscr{C}_t.
$$

We shall consider equation (23) for $t \le T$ with $X_0 = 0$ (for simplicity) and shall
assume that $a = a(t,x)$ is a progressively measurable functional,

$$
P^W\Big\{\int_0^T a^2(t,x)\,dt < \infty\Big\} = 1, \tag{24}
$$

and

$$
E^W \exp\Big\{\int_0^T a(t,x)\,dW_t(x) - \frac12 \int_0^T a^2(t,x)\,dt\Big\} = 1, \tag{25}
$$

where $W = (W_t(x))_{t\ge0}$ is the canonical Wiener process ($W_t(x) = x_t$) and $E^W$
is averaging with respect to the measure $P^W$.

In accordance with our definition of a weak solution, we must construct a probability
space $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t\le T}, P)$ and processes $X = (X_t, \mathcal{F}_t)$ and $B = (B_t, \mathcal{F}_t)$ on
it such that $B$ is a Brownian motion with respect to the measure $P$ and

$$
X_t = \int_0^t a(s, X)\,ds + B_t \tag{26}
$$

(P-a.s.) for each $t \le T$.

We now set

$$
\Omega = C, \qquad \mathcal{F} = \mathscr{C}, \qquad \mathcal{F}_t = \mathscr{C}_t
$$

and define a measure $P_T$ in $\mathcal{F}_T$ by setting

$$
P_T(dx) = Z_T(x)\,P^W_T(dx),
$$

where

$$
Z_T(x) = \exp\Big\{\int_0^T a(t,x)\,dW_t(x) - \frac12 \int_0^T a^2(t,x)\,dt\Big\}.
$$

The process

$$
B_t(x) = W_t(x) - \int_0^t a(s, W(x))\,ds, \quad t \le T,
$$

on the probability space $(\Omega, \mathcal{F}_T, P_T)$ is a Brownian motion by Girsanov's theorem.
Hence, setting $X_t(x) = W_t(x)$ we obtain

$$
X_t(x) = \int_0^t a(s, X(x))\,ds + B_t(x), \quad t \le T,
$$

which precisely proves the existence of a weak solution to the stochastic differential
equation (23) (under assumptions (24) and (25)).


Remark 2. Conditions (24) and (25) hold for sure if $|a(t,x)| \le c$ for all $t \le T$
and $x \in C$. Hence equation (23) has a weak solution in this case. We point out,
however, that such an equation does not necessarily have a strong solution, as
shows an example suggested by B. Tsirelson (see, for instance, [303; §4.4]). In
this connection we recall that an equation $dX_t = a(t, X_t)\,dt + dB_t$ with coefficient
$a(t, X_t)$ dependent only on the 'present' $X_t$, rather than on the entire 'past' $X_s$,
$s \le t$ (as in (23)), has not merely a weak solution, but a strong one (see subsection 2
above for a description of A. K. Zvonkin's result [485]).


§ 3f. Forward and Backward Kolmogorov's Equations.
Probabilistic Representation of Solutions


1. Below we expose several results and methods of the theory of diffusion Markov
processes that have their origin in A. N. Kolmogorov's fundamental paper “Über
die analytischen Methoden in der Wahrscheinlichkeitsrechnung” [280] (1931).

P. S. Aleksandrov and A. Ya. Khintchine [5] wrote about this work, which
revealed close relations of the theory of stochastic processes to mathematical analysis
in general and the theory of differential equations (both ordinary and partial) in
particular:


“In entire probability theory of the 20th century it is difficult to find
another study as fundamental for the further development of the
science...”


In [280] Kolmogorov does not consider the trajectories of, say, Markov processes
$X = (X_t)_{t\ge0}$. He studies instead the properties of the transition probabilities

$$
P(s,x;t,A) = P(X_t \in A \mid X_s = x), \quad x \in \mathbb{R}, \quad A \in \mathscr{B}(\mathbb{R}),
$$

i.e., the probability of the event that a trajectory of $X$ arrives in the set $A$ at time
$t$, provided that $X_s = x$ at time $s$.

The starting point for his analysis is the equation

$$
P(s,x;t,A) = \int_{\mathbb{R}} P(s,x;u,dy)\,P(u,y;t,A) \tag{1}
$$

($0 \le s < u < t$), which expresses the Markov property. (One usually calls (1) the
Kolmogorov-Chapman equation.)
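
For a concrete check of (1) one can take the standard Brownian motion, whose transition probabilities have Gaussian densities, and verify the equation in its density form by numerical integration. This is a sketch with assumed names (`p`, `chapman_rhs`) and an assumed truncation of the integral to a finite interval.

```python
import math

def p(s, x, t, y):
    """Transition density of a standard Brownian motion:
    the normal density with mean x and variance t - s, evaluated at y."""
    var = t - s
    return math.exp(-(y - x) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def chapman_rhs(s, x, u, t, y, lo=-12.0, hi=12.0, n=4800):
    """Trapezoidal approximation of the integral over z of p(s,x;u,z) * p(u,z;t,y)."""
    h = (hi - lo) / n
    total = 0.0
    for k in range(n + 1):
        z = lo + k * h
        weight = 0.5 if k in (0, n) else 1.0
        total += weight * p(s, x, u, z) * p(u, z, t, y)
    return total * h

lhs = p(0.0, 0.3, 1.0, -0.5)                 # direct transition from time 0 to time 1
rhs = chapman_rhs(0.0, 0.3, 0.4, 1.0, -0.5)  # transition through the intermediate time 0.4
print(lhs, rhs)
```

The two values coincide up to quadrature error, whatever intermediate time $u \in (s, t)$ is chosen.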
Assuming that there exist densities

$$
f(s,x;t,y) = \frac{\partial F(s,x;t,y)}{\partial y}, \tag{2}
$$

where

$$
F(s,x;t,y) = P(s,x;t,(-\infty, y]),
$$

and limits

$$
a(s,x) = \lim_{\Delta\downarrow0} \frac{1}{\Delta} \int (y - x)\,f(s,x;s+\Delta,y)\,dy, \tag{3}
$$

$$
b^2(s,x) = \lim_{\Delta\downarrow0} \frac{1}{\Delta} \int (y - x)^2 f(s,x;s+\Delta,y)\,dy, \tag{4}
$$

and imposing certain regularity conditions on the functions in question and also
the condition

$$
\lim_{\Delta\downarrow0} \frac{1}{\Delta} \int |y - x|^{2+\delta} f(s,x;s+\Delta,y)\,dy = 0 \tag{5}
$$

with $\delta > 0$, Kolmogorov derives from (1) the following backward parabolic differential
equations (with respect to $x \in \mathbb{R}$ and $s < t$) for such diffusion processes (see
[280] or, e.g., [170] and [182] for detail):

$$
-\frac{\partial f}{\partial s} = a(s,x)\,\frac{\partial f}{\partial x} + \frac12 b^2(s,x)\,\frac{\partial^2 f}{\partial x^2}. \tag{6}
$$

He also obtains the forward parabolic equations (with respect to $y \in \mathbb{R}$ and $t > s$):

$$
\frac{\partial f}{\partial t} = -\frac{\partial}{\partial y}\big[a(t,y)\,f\big] + \frac12 \frac{\partial^2}{\partial y^2}\big[b^2(t,y)\,f\big]. \tag{7}
$$

He discusses the existence, the uniqueness, and the regularity of the solutions
to these equations. (We note that equations (7) had been considered before by
A. D. Fokker [161] and M. Planck [389] in their analysis of diffusion from a physical
standpoint.)

2. The theory and the methods of Markov processes made the next major step in
the 1940s and the 1950s, in papers by K. Itô [242]-[244], who sought an 'explicit',
effective construction of diffusion processes (and also diffusion processes with jumps)
with well-defined local characteristics $a(s,x)$ and $b^2(s,x)$ (see (3) and (4)).

He succeeded by representing the desired processes in terms of some 'basic'
Brownian motion $B = (B_t)_{t\ge0}$, as solutions of stochastic differential equations

$$
dX_t = a(t, X_t)\,dt + b(t, X_t)\,dB_t. \tag{8}
$$

In accordance with [242]-[244] (see also § 3e), equation (8) with $X_0 = \mathrm{Const}$ and
with coefficients satisfying the local Lipschitz condition and linear growth condition
with respect to the phase variable has a strong solution, which, moreover, is unique.
If, in addition, the coefficients $a(t,x)$ and $b(t,x)$ are continuous in $(t,x)$, then $X$ is
a Markov diffusion process, which has, in particular, properties (3)-(5). Hence, if
we add certain conditions on the smoothness of the transitional densities and the
coefficients $a(t,x)$ and $b(t,x)$ (see, e.g., §14 in [182] for more detail), then both
forward and backward Kolmogorov equations are satisfied.

In a similar way one can treat the case of $d$-dimensional processes
$X = (X^1, \dots, X^d)$ satisfying the equations

$$
dX^i_t = a^i(t, X_t)\,dt + \sum_{j=1}^d b^{ij}(t, X_t)\,dB^j_t. \tag{9}
$$

Setting

$$
a^{ij} = \sum_{k=1}^d b^{ik} b^{jk}, \tag{10}
$$

$$
L(s,x)f = \sum_{i=1}^d a^i(s,x)\,\frac{\partial f}{\partial x_i}
  + \frac12 \sum_{i,j=1}^d a^{ij}(s,x)\,\frac{\partial^2 f}{\partial x_i \partial x_j}, \tag{11}
$$

and

$$
L^*(t,y)f = -\sum_{i=1}^d \frac{\partial}{\partial y_i}\big[a^i(t,y)\,f\big]
  + \frac12 \sum_{i,j=1}^d \frac{\partial^2}{\partial y_i \partial y_j}\big[a^{ij}(t,y)\,f\big] \tag{12}
$$

(cf. formula (27) in §3d) we can write the backward and the forward Kolmogorov
parabolic equations in the following form:

$$
-\frac{\partial f}{\partial s} = L(s,x)f \tag{13}
$$

and

$$
\frac{\partial f}{\partial t} = L^*(t,y)f. \tag{14}
$$

We point out the case when the $a^i$ and $b^{ij}$ are independent of time, i.e.,
$a^i = a^i(x)$ and $b^{ij} = b^{ij}(x)$.
In this case, for $0 \le s < t$ we have

$$
f(s,x;t,y) = f(0,x;t-s,y).
$$

Introducing the function $g = g(x,t;y) = f(0,x;t,y)$ we see from (13) that it satisfies
the following parabolic equation with respect to $(t,x)$:

$$
\frac{\partial g}{\partial t} = L(x)\,g, \tag{15}
$$

where $L(x)g = \sum_{i=1}^d a^i(x)\,\frac{\partial g}{\partial x_i} + \frac12 \sum_{i,j=1}^d a^{ij}(x)\,\frac{\partial^2 g}{\partial x_i \partial x_j}$.

3. We proceed now to a discussion of several well-known results showing how one 
can obtain probabilistic representations (in terms of Brownian motions and solu- 
tions of stochastic differential equations) for the solutions of several classical prob- 
lems of partial differential equations. 


A. Cauchy problem 


Find a continuous function $u = u(t,x)$ in the domain $[0,\infty) \times \mathbb{R}^d$ such that

$$
u(0,x) = \varphi(x), \tag{16}
$$

where $\varphi = \varphi(x)$ is some fixed function, and such that $u$ satisfies one of the
parabolic equations below (cf. (15)).


A1. The heat equation.

$$
\frac{\partial u}{\partial t} = \frac12 \Delta u, \tag{17}
$$

where $\Delta$ is the Laplace operator $\Delta = \sum_{i=1}^d \frac{\partial^2}{\partial x_i^2}$.

The function

$$
v(t,x) = E_x \varphi(B_t) \tag{17'}
$$

is a solution of the Cauchy problem for (17); it is called the probabilistic solution.
(By $E_x$ in (17') we mean averaging with respect to the measure $P_x$ corresponding
to a Brownian motion starting from $x$, i.e., $B_0 = x$.)
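
The probabilistic solution (17') lends itself to a direct Monte Carlo evaluation. In the sketch below the dimension $d = 1$ and the initial function $\varphi(x) = \cos x$ are assumptions chosen because the answer is known in closed form: $v(t,x) = e^{-t/2}\cos x$, and one checks directly that $\partial v/\partial t = -\frac12 v = \frac12\,\partial^2 v/\partial x^2$. The function name `heat_solution_mc` is illustrative.

```python
import math
import random

def heat_solution_mc(t, x, n_paths=200000, seed=5):
    """Monte Carlo estimate of v(t, x) = E_x phi(B_t) for phi = cos in dimension one.
    Under P_x the variable B_t is normal with mean x and variance t."""
    rng = random.Random(seed)
    sd = math.sqrt(t)
    total = 0.0
    for _ in range(n_paths):
        total += math.cos(x + rng.gauss(0.0, sd))
    return total / n_paths

t, x = 0.7, 0.4
v_mc = heat_solution_mc(t, x)
v_exact = math.exp(-0.5 * t) * math.cos(x)   # closed-form solution for phi = cos
print(v_mc, v_exact)
```

The statistical error of such an estimate decreases like the inverse square root of the number of samples, independently of the dimension.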
A2. The inhomogeneous heat equation.

$$
\frac{\partial u}{\partial t} = \frac12 \Delta u + \psi, \tag{18}
$$

where $\psi = \psi(t,x)$.

The probabilistic solution is as follows:

$$
v(t,x) = E_x\Big\{\varphi(B_t) + \int_0^t \psi(t-s, B_s)\,ds\Big\}. \tag{18'}
$$

A3. The Feynman-Kac equation.

$$
\frac{\partial u}{\partial t} = \frac12 \Delta u + cu, \tag{19}
$$

where $c = c(x)$.

Its probabilistic solution is

$$
v(t,x) = E_x\Big\{\varphi(B_t)\exp\Big(\int_0^t c(B_s)\,ds\Big)\Big\}. \tag{19'}
$$

A4. The Cameron-Martin equation.

$$
\frac{\partial u}{\partial t} = \frac12 \Delta u + (a, \nabla u), \tag{20}
$$

where $a = (a^1(x), \dots, a^d(x))$ and $\nabla = \big(\frac{\partial}{\partial x_1}, \dots, \frac{\partial}{\partial x_d}\big)$.

Its probabilistic solution is

$$
v(t,x) = E_x\Big\{\varphi(B_t)\exp\Big(\int_0^t (a(B_s), dB_s) - \frac12 \int_0^t \|a(B_s)\|^2\,ds\Big)\Big\}. \tag{20'}
$$

Remark 1. Of course, if one requires that the functions $v(t,x)$ in (17')-(20') be
solutions, then one needs first of all that the expectations in these formulas be well
defined. To this end it suffices, for instance, that the functions $\varphi(x)$, $\psi(t,x)$, $c(x)$,
and $a(x)$ be bounded. (These assumptions turn out to be excessively restrictive in
many problems; in that case the question should be studied more carefully.)


Remark 2. In our discussion of 'probabilistic solutions' we do not mean solutions in
some special 'probabilistic' class; these solutions are merely defined in probabilistic
terms such as averaging with respect to a Wiener measure and so on.


B. Dirichlet problem


One looks for a function $u = u(x)$ in the class $C^2$ in a bounded domain $G \subset \mathbb{R}^d$
that satisfies the equation

$$
\Delta u = 0, \quad x \in G \tag{21}
$$

(a harmonic function) and the condition

$$
u(x) = \varphi(x), \quad x \in \partial G. \tag{22}
$$

Let

$$
\tau = \inf\{t\colon B_t \notin G\}.
$$

Then the probabilistic solution is

$$
v(x) = E_x \varphi(B_\tau). \tag{23}
$$

Remark 3. Here one must also impose certain conditions on the domain $G$ for
$\tau = \tau(\omega)$ to be a finite Markov time; it is also necessary that $\varphi(B_\tau)$ be integrable.

4. We now discuss the main steps in the proof that v(x) in (23) is indeed the 
solution of (21) (under several additional assumptions). 

Let G be a bounded open subset of RÊ and let y = y(x) be a bounded function. 
We shall also assume that v(x) = Ezy(B,) isa C2-function. Then by Ito's formula 
we obtain (on [0, r] = {(w,t): t < r(w)}) 


t t 
(Bi) = (Bo) + 5 | (A0)(B:)ds+ | (V0)(B,) dB (24) 


If v satisfies the condition 


(LE (se) e<) 


then the last integral in (24) is a local martingale on [0, r[| (see [250; p. 88]). 
By the Markov property of a Brownian motion,

$$\mathsf{E}_x\big(\varphi(B_\tau) \mid \mathcal{F}_t\big) = v(B_t) \tag{25}$$

($\mathsf{P}$-a.s.) on the set $[0, \tau[$, i.e., $(v(B_t))$ is a martingale on this set.

Hence it follows from (24) that the integral $\big(\int_0^t \Delta v(B_s)\,ds\big)_{t<\tau}$ is a local martingale on $[0, \tau[$. At the same time it is a continuous process of bounded variation, therefore this is a zero process. (This implication, which is a consequence of the Doob-Meyer decomposition for submartingales, is routinely used when one must verify that a probabilistic solution really satisfies one or another equation; see [250; Chapter I, § 3b] and § 5b below.)

Hence we conclude that $\Delta v(x) = 0$, $x \in G$, because if this equality fails at a point $x_0 \in G$, then $\Delta v(x) \ne 0$ in a neighborhood of $x_0$ by the continuity of $\Delta v(x)$, so that, with positive probability, the process $\big(\int_0^t \Delta v(B_s)\,ds\big)_{t<\tau}$ is nonzero on $[0, \tau[$.


Remark 4. As regards the unique solvability of Dirichlet problems for elliptic equations and the representation of an arbitrary solution $u(x)$ of such a problem as $\mathsf{E}_x\,\varphi(B_\tau)$ see, e.g., [123; 8.5], or [170; vol. 1, Chapter 6, § 2].

5. On the conceptual level, the issue of probabilistic solutions to a Cauchy problem can be treated in much the same way.

For example, we consider now the heat equation (17); we claim that the function $v(t, x) = \mathsf{E}_x\,\varphi(B_t)$ is a solution to this equation (under several additional assumptions) and satisfies the initial condition $u(0, x) = \varphi(x)$.

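Indeed, by the Markov property of a Brownian motion,

$$\mathsf{E}_x\big(\varphi(B_t) \mid \mathcal{F}_s\big) = v(t - s, B_s). \tag{26}$$

Since $\big(\mathsf{E}_x(\varphi(B_t) \mid \mathcal{F}_s)\big)_{s \le t}$ is a martingale (for $|\varphi| \le c$, at any rate), the process $\big(v(t - s, B_s)\big)_{s \le t}$ is also a martingale.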


Further, if $v \in C^{1,2}$, then we can use Itô's formula to obtain

$$v(t - s, B_s) = v(t, B_0) + \int_0^s \Big(-\frac{\partial v}{\partial t} + \frac12 \Delta v\Big)(t - r, B_r)\,dr + \int_0^s \nabla v(t - r, B_r)\,dB_r. \tag{27}$$

If

$$\mathsf{P}_x\Big\{\int_0^t \big|\nabla v(t - r, B_r)\big|^2\,dr < \infty\Big\} = 1 \tag{28}$$

for each $t$, then the last integral in (27) is a local martingale. Hence the process

$$\Big(\int_0^s \Big(-\frac{\partial v}{\partial t} + \frac12 \Delta v\Big)(t - r, B_r)\,dr\Big)_{s \le t}$$

is indistinguishable ($\mathsf{P}$-a.s.) from a zero process, because it is both a local martingale and a continuous process of bounded variation.

If, in addition, $\varphi = \varphi(x)$ is a continuous function, then $v(t, x) \to \varphi(x)$ as $t \downarrow 0$.

In a similar way, using Itô's formula, one can prove that, indeed, formulas (17')-(20') describe probabilistic solutions to the problems (17)-(20), respectively.
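As a quick sanity check of the representation $v(t, x) = \mathsf{E}_x\,\varphi(B_t)$, here is a worked example. It is only an illustration under stated assumptions: dimension $d = 1$, the heat equation (17) written as $\partial u/\partial t = \frac12\,\partial^2 u/\partial x^2$, and the initial function $\varphi(x) = x^2$, which is not bounded (so the boundedness used for (26) does not hold literally) but is integrable with respect to the Gaussian law of $B_t$.

```latex
% Under P_x we have B_t = x + W_t with a standard Wiener process W, hence
v(t,x) = \mathsf{E}_x B_t^2 = \mathsf{E}\,(x + W_t)^2 = x^2 + t,
\qquad
\frac{\partial v}{\partial t} = 1 = \frac12\,\frac{\partial^2 v}{\partial x^2},
\qquad
v(0,x) = x^2 = \varphi(x).
```

In this case the integrand $-\partial v/\partial t + \frac12 \Delta v$ in (27) vanishes identically, in agreement with the general argument.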


Remark 5. As regards a more thorough discussion of the above-stated Cauchy and Dirichlet problems for parabolic equations and also the corresponding problem for the forward and backward Kolmogorov equations, see, e.g., [123], [170], [182], and [288].


4. Diffusion Models of the Evolution 
of Interest Rates, Stock and Bond Prices 


§ 4a. Stochastic Interest Rates 


1. The simplest model where one encounters stochastic interest rates $r = (r_n)_{n \ge 1}$ is the model of a bank account $B = (B_n)_{n \ge 0}$, in which (by definition)

$$r_n = \frac{\Delta B_n}{B_{n-1}}. \tag{1}$$

If $(\Omega, \mathcal{F}, (\mathcal{F}_n)_{n \ge 0}, \mathsf{P})$ is a stochastic basis (a filtered probability space) describing the stochastics of a financial market and the available ‘information’ $(\mathcal{F}_n)_{n \ge 0}$ about this market, then, as already pointed out (Chapter II, § 1e), it is natural to assume that the state $B_n$ of the bank account is $\mathcal{F}_{n-1}$-measurable.

Hence both sequences $B = (B_n)_{n \ge 0}$ and $r = (r_n)_{n \ge 1}$ are predictable (see Chapter II, § 1a).

This feature explains why one usually imposes the requirement of the predictability of the interest rates $r = (r(t))_{t \ge 0}$ in the continuous-time case (see § 5a and, for greater detail, [250; Chapter I]); this requirement is automatically satisfied if $r = (r(t))_{t \ge 0}$ is a continuous (or only left-continuous) process.

In what follows we consider only models in which the interest rates $r = (r(t))_{t \ge 0}$ are diffusion processes and, therefore, have continuous trajectories (so that the additional requirement of predictability becomes superfluous).

In the case of continuous time $t \ge 0$ the standard definition of the bank interest rate $r = (r(t))_{t \ge 0}$ is based on the relation

$$dB_t = r(t) B_t\,dt, \tag{2}$$

which is a natural continuous analog of (1).

Clearly,

$$r(t) = (\ln B_t)' \tag{3}$$

and

$$B_t = B_0 \exp\Big\{\int_0^t r(s)\,ds\Big\}. \tag{4}$$

The concept of interest rate (short rate of interest, spot rate, instantaneous interest rate) plays an even more important role in the ‘indirect data’ of the evolution of share prices (see § 4c below). This explains the existence of a variety of models with interest rates $r = (r(t))_{t \ge 0}$ described by diffusion equations

$$dr(t) = a(t, r(t))\,dt + b(t, r(t))\,dW_t \tag{5}$$

or, say, by equations of ‘diffusion-with-jumps’ type

$$dr(t) = a(t, r(t))\,dt + b(t, r(t))\,dW_t + \int c(t, r(t-), x)\,\big(\mu(dt, dx) - \nu(dt, dx)\big). \tag{6}$$

Here $\mu = \mu(dt, dx)$ is a random Poisson measure (in the time variable $t$ and the jump variable $x$), and $\nu = \nu(dt, dx)$ is its compensator (see [250; Chapter III, § 2c] for greater detail).


2. We discuss now several popular models of interest rates $r = (r(t))_{t \ge 0}$ falling in the class of diffusion models (5) with a standard Wiener process (a Brownian motion) $W = (W_t)_{t \ge 0}$ defined on some stochastic basis $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t \ge 0}, \mathsf{P})$.

The Merton model (R. C. Merton, [346] (1973)):

$$dr(t) = \alpha\,dt + \gamma\,dW_t. \tag{7}$$

The Vasiček model (O. Vasiček [472] (1977)):

$$dr(t) = (\alpha - \beta r(t))\,dt + \gamma\,dW_t. \tag{8}$$

The Dothan model (M. Dothan [111] (1978)):

$$dr(t) = \alpha r(t)\,dt + \gamma r(t)\,dW_t. \tag{9}$$

The Cox-Ingersoll-Ross models (J. C. Cox, J. E. J. Ingersoll, and S. A. Ross [80] (1980) and [81] (1985)):

$$dr(t) = \beta\,(r(t))^{3/2}\,dW_t \tag{10}$$

and

$$dr(t) = (\alpha - \beta r(t))\,dt + \gamma\,(r(t))^{1/2}\,dW_t. \tag{11}$$

The Ho-Lee model (T. Ho and S. Lee [224] (1986)):

$$dr(t) = \alpha(t)\,dt + \gamma\,dW_t. \tag{12}$$

The Black-Derman-Toy model (F. Black, E. Derman, and W. Toy [42] (1990)):

$$dr(t) = \alpha(t) r(t)\,dt + \gamma(t)\,dW_t. \tag{13}$$

The Hull-White models (J. Hull and A. White, [234] (1990)):

$$dr(t) = (\alpha(t) - \beta(t) r(t))\,dt + \gamma(t)\,dW_t \tag{14}$$

and

$$dr(t) = (\alpha(t) - \beta(t) r(t))\,dt + \gamma(t)\,(r(t))^{1/2}\,dW_t. \tag{15}$$

The Black-Karasinski model (F. Black and P. Karasinski [43] (1991)):

$$dr(t) = r(t)\big(\alpha(t) - \beta(t) \ln r(t)\big)\,dt + \gamma(t) r(t)\,dW_t. \tag{16}$$

The Sandmann-Sondermann model (K. Sandmann and D. Sondermann [422] (1993)):

$$r(t) = \ln(1 + \xi(t)), \tag{17}$$

where

$$d\xi(t) = \xi(t)\big(\alpha(t)\,dt + \gamma(t)\,dW_t\big). \tag{18}$$
3. In introducing these models their authors also explain the motivations behind them.

For instance, the Vasiček model (8) comes naturally with the assumption that the interest rate oscillates near a fixed level $\alpha/\beta$. (It is clear from (8) that the process has a positive drift for $r(t) < \alpha/\beta$, and a negative drift for $r(t) > \alpha/\beta$; if $\alpha = 0$, then (8) turns to the Ornstein-Uhlenbeck equation discussed in § 3a.)
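The pull towards the level $\alpha/\beta$ can be seen in a small simulation. The following is a minimal sketch, assuming constant coefficients and an Euler discretization of (8); the function names and parameter values are hypothetical and serve only as an illustration. With $\gamma = 0$ the scheme should reproduce the deterministic mean $\mathsf{E}\,r(t) = \alpha/\beta + (r_0 - \alpha/\beta)e^{-\beta t}$.

```python
import math
import random


def vasicek_mean(r0, alpha, beta, t):
    """Mean of r(t) in model (8) with constant coefficients."""
    level = alpha / beta
    return level + (r0 - level) * math.exp(-beta * t)


def vasicek_euler(r0, alpha, beta, gamma, t_end, n_steps, rng):
    """Euler scheme for dr = (alpha - beta r) dt + gamma dW on [0, t_end]."""
    dt = t_end / n_steps
    r = r0
    path = [r]
    for _ in range(n_steps):
        dw = rng.gauss(0.0, math.sqrt(dt))   # Wiener increment over dt
        r += (alpha - beta * r) * dt + gamma * dw
        path.append(r)
    return path


alpha, beta, gamma = 0.03, 0.5, 0.01        # illustrative values only
noise_free = vasicek_euler(0.10, alpha, beta, 0.0, 5.0, 10_000, random.Random(1))
noisy = vasicek_euler(0.10, alpha, beta, gamma, 5.0, 10_000, random.Random(1))
print(noise_free[-1], vasicek_mean(0.10, alpha, beta, 5.0), alpha / beta)
```

Starting above the level ($r_0 = 0.10 > \alpha/\beta = 0.06$) the drift is negative, so the noise-free path decreases towards $0.06$; the noisy path fluctuates around it.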

It should be noted that, as shown by many empirical studies of bond rates (see, e.g., [69] and [70]), one cannot in general say that interest rates display a trend of returning to some fixed mean value ($\alpha/\beta$) (mean reversion).

This is taken into account in the Hull-White models, in which the variable level $\alpha(t)/\beta(t)$, $t \ge 0$, replaces the constant $\alpha/\beta$.

Moving even further, we can assume that this variable level is itself a stochastic process. The following model provides a suitable example here.

The Chen model (L. Chen ([70], 1995)):

$$dr(t) = (\alpha(t) - r(t))\,dt + (\gamma(t) r(t))^{1/2}\,dW_t^1, \tag{19}$$

where $\alpha(t)$ and $\gamma(t)$ are stochastic processes of diffusion type:

$$d\alpha(t) = (\alpha - \alpha(t))\,dt + (\alpha(t))^{1/2}\,dW_t^2, \tag{20}$$

$$d\gamma(t) = (\gamma - \gamma(t))\,dt + (\gamma(t))^{1/2}\,dW_t^3 \tag{21}$$

(here $\alpha$ and $\gamma$ are constants, while $W^1$, $W^2$, and $W^3$ are independent Wiener processes).

In many of the above models the diffusion coefficient (‘volatility’) depends on the current level of the interest rate $r(t)$. This can be explained, for instance, as follows: if the interest rate strongly increases, then the risks of investing into the assets in question are also higher. These risks are reflected by the fluctuation terms (e.g., $(r(t))^{1/2}\,dW_t$) in the equations for $r(t)$.

4. We now present another (fairly simplified) model of the evolution of interest rates, which was inspired by the following considerations.

Plausibly enough, the stochastic process $r = (r(t))_{t \ge 0}$ is, to a certain extent, a reflection of the state of the ‘economy’, a judgment on this state.

Taking this as a starting point, we shall now assume that the state of the ‘economy’ can be simulated by, say, a homogeneous Markov jump process $\theta = (\theta(t))_{t \ge 0}$ that has (for simplicity) only two states, $i = 0$ and $i = 1$. Let $\mathsf{P}(\theta(0) = 0) = \mathsf{P}(\theta(0) = 1) = \frac12$ and assume that the transition probability densities $\lambda_{ij}$ satisfy the equalities $\lambda_{ii} = -\lambda$ and $\lambda_{ij} = \lambda$ for $i \ne j$.

Thus, the ‘economy’ switches between the states $i = 0$ and $i = 1$, and the time of each stay is distributed exponentially, with parameter $\lambda$.

We also assume that we can judge the state $\theta = (\theta(t))_{t \ge 0}$ of the ‘economy’ only by indirect data: namely, we can observe the process $X = (X_t)_{t \ge 0}$ with differential

$$dX_t = \theta(t)\,dt + dW_t, \tag{22}$$

where $W = (W_t)_{t \ge 0}$ is some Wiener process.

Now let

$$r(t) = \mathsf{E}\big(\theta(t) \mid \mathcal{F}_t^X\big) \tag{23}$$

be the optimal (in the mean square sense) estimator for $\theta(t)$ on the basis of the observations $X_s$, $s \le t$ ($\mathcal{F}_t^X = \sigma(X_s,\ s \le t)$).

By general nonlinear filtering theory (see [303; formula (9.86)]),

$$dr(t) = \lambda(1 - 2r(t))\,dt + r(t)(1 - r(t))\,(dX_t - r(t)\,dt). \tag{24}$$

We note that the process $\overline{W} = (\overline{W}_t)_{t \ge 0}$, where

$$\overline{W}_t = X_t - \int_0^t r(s)\,ds, \tag{25}$$

is a Wiener process with respect to the flow $(\mathcal{F}_t^X)_{t \ge 0}$ (see Chapter VII, § 3b and [303; Theorem 7.12]; the process $\overline{W}$ is called an innovation process). Hence (24) can be rewritten as

$$dr(t) = \lambda(1 - 2r(t))\,dt + r(t)(1 - r(t))\,d\overline{W}_t, \tag{26}$$


which is an equation of type (5). It is also interesting that, in view of (22), by (24) we obtain

$$dr(t) = a(r(t), \theta(t))\,dt + b(r(t))\,dW_t, \tag{27}$$

where

$$a(r, \theta) = \lambda(1 - 2r) + r(1 - r)(\theta - r) \quad\text{and}\quad b(r) = r(1 - r)$$

(cf. the Chen model (19)).

5. The above models of the dynamics of interest rates $r = (r(t))_{t \ge 0}$ rely on stochastic differential equations that involve some basic Wiener process.

Incidentally, many of these equations admit ‘explicit’ solutions that are functionals of this Wiener process. For instance, the solution of the equation (8) in the Vasiček model and of its generalization, equation (14) in the Hull-White model, has the representation

$$r(t) = g(t)\Big[r_0 + \int_0^t \frac{\alpha(s)}{g(s)}\,ds + \int_0^t \frac{\gamma(s)}{g(s)}\,dW_s\Big] \tag{28}$$

(because (8) and (14) are linear equations), where

$$g(t) = \exp\Big\{-\int_0^t \beta(s)\,ds\Big\}. \tag{29}$$

One can see this easily using Itô's formula; cf. formula (7) in § 3e.


We now set

$$T(t) = \int_0^t \Big(\frac{\gamma(s)}{g(s)}\Big)^2\,ds. \tag{30}$$

Assuming that $T(t) < \infty$ for all $t > 0$ and $T(t) \to \infty$ as $t \to \infty$, we now introduce ‘new’ time $\theta$ by the equality $\theta = T(t)$ (see Chapter IV, § 3d for a more detailed discussion of this time as ‘operational’).

As is well known (see, e.g., [303; Lemma 17.4]), under these assumptions there exists a (new) Wiener process $W^* = (W_t^*)_{t \ge 0}$ such that

$$\int_0^t \frac{\gamma(s)}{g(s)}\,dW_s = W^*_{T(t)}. \tag{31}$$

Hence we obtain the following representation for $r(t)$ in (28):

$$r(t) = f(t) + g(t)\,W^*_{T(t)}, \tag{32}$$

where

$$f(t) = g(t)\Big[r_0 + \int_0^t \frac{\alpha(s)}{g(s)}\,ds\Big]. \tag{33}$$

If

$$T^*(\theta) = \inf\{t\colon T(t) = \theta\},$$

then we can pass back from ‘new’ time $\theta = T(t)$ to ‘old’ time using the formula $t = T^*(\theta)$. This shows that we can write our process as $r^*(\theta) = r(T^*(\theta))$ and the process $r^* = (r^*(\theta))_{\theta \ge 0}$ has a very simple structure. Namely,

$$r^*(\theta) = f^*(\theta) + g^*(\theta)\,W^*_\theta,$$

where $f^*(\theta) = f(T^*(\theta))$ and $g^*(\theta) = g(T^*(\theta))$.

In the Black-Karasinski model (16) we have

$$d \ln r(t) = \Big(\alpha(t) - \frac12 \gamma^2(t) - \beta(t) \ln r(t)\Big)\,dt + \gamma(t)\,dW_t. \tag{34}$$

Defining $T(t)$ by the same formula (30) we see that

$$r(t) = F\big(f(t) + g(t)\,W^*_{T(t)}\big) \tag{35}$$

(for some other Wiener process, $W^* = (W_t^*)_{t \ge 0}$) where $g(t)$ is as in (29),

$$f(t) = g(t)\Big[\ln r_0 + \int_0^t \frac{\alpha(s) - \frac12 \gamma^2(s)}{g(s)}\,ds\Big], \tag{36}$$

and $F(x) = e^x$.

In a similar way, for the Sandmann-Sondermann model (17)-(18) we obtain

$$r(t) = F\big(f(t) + W^*_{T(t)}\big), \tag{37}$$

where $F(x) = \ln(1 + e^x)$, $T(t) = \int_0^t \gamma^2(s)\,ds$, and

$$f(t) = \ln \xi(0) + \int_0^t \Big(\alpha(s) - \frac12 \gamma^2(s)\Big)\,ds.$$

W. Schmidt [426] has observed that in all the above ‘explicit’ representations the interest rates $r(t)$ have the form (35). This brings one to the following, rather general, model.

The Schmidt model (W. Schmidt [426] (1997)):

$$r(t) = F\big(f(t) + g(t)\,W_{T(t)}\big), \tag{38}$$

where $W = (W_t)_{t \ge 0}$ is a Wiener process, $T(t)$ and $F(x)$ are continuous nonnegative strictly increasing functions of $t \ge 0$ and $x \in \mathbb{R}$, respectively, while $f = f(t)$ and $g = g(t) > 0$ are continuous functions.

Note that ‘new’ time $\theta = T(t)$ in this model is a deterministic function of ‘old’ time $t$, so that $(X_t = f(t) + g(t)\,W_{T(t)})_{t \ge 0}$ is a Gaussian process. Choosing suitable functions $F(x)$ one can obtain various probability distributions of the interest rates $r(t)$.
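The time change behind these representations can be made concrete for the Vasiček model. The sketch below is an illustration under the assumption of constant coefficients $\alpha$, $\beta$, $\gamma$ in (8), for which $g(t) = e^{-\beta t}$ and the integral $T(t) = \int_0^t (\gamma/g(s))^2\,ds$ has the closed form $\gamma^2 (e^{2\beta t} - 1)/(2\beta)$; it uses $F(x) = x$, i.e., the representation $r(t) = f(t) + g(t) W_{T(t)}$. All function names and numerical values are hypothetical.

```python
import math
import random


def g(t, beta):
    """g(t) = exp(-beta t) for constant beta."""
    return math.exp(-beta * t)


def new_time(t, beta, gamma):
    """Closed form of T(t) = int_0^t (gamma / g(s))^2 ds."""
    return gamma ** 2 * (math.exp(2.0 * beta * t) - 1.0) / (2.0 * beta)


def new_time_numeric(t, beta, gamma, n=20_000):
    """The same integral computed by the midpoint rule, as a cross-check."""
    h = t / n
    return sum((gamma / g((k + 0.5) * h, beta)) ** 2 for k in range(n)) * h


def f(t, r0, alpha, beta):
    """f(t) = g(t) [r0 + int_0^t alpha / g(s) ds] for constant alpha, beta."""
    return g(t, beta) * (r0 + alpha * (math.exp(beta * t) - 1.0) / beta)


def sample_rate(t, r0, alpha, beta, gamma, rng):
    """Draw r(t) = f(t) + g(t) W_{T(t)}, using W_{T(t)} ~ N(0, T(t))."""
    w = rng.gauss(0.0, math.sqrt(new_time(t, beta, gamma)))
    return f(t, r0, alpha, beta) + g(t, beta) * w


r0, alpha, beta, gamma, t = 0.10, 0.03, 0.5, 0.01, 2.0
variance = g(t, beta) ** 2 * new_time(t, beta, gamma)   # Var r(t)
draw = sample_rate(t, r0, alpha, beta, gamma, random.Random(0))
print(f(t, r0, alpha, beta), variance, draw)
```

Since $T(t)$ is deterministic, a single value $r(t)$ can be sampled without simulating a path: the variance $g^2(t)\,T(t)$ reduces here to the familiar $\gamma^2(1 - e^{-2\beta t})/(2\beta)$.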


6. The Schmidt model (38) is also attractive in the following respect: its ‘discrete’ version enables one to build easily discrete models of the evolution of interest rates, using one or another random-walk approximation to a Wiener process.

For instance, if

$$T_i^{(n)} = \inf\Big\{t \ge 0\colon T(t) \ge \frac{i}{n}\Big\}$$

for $n \ge 1$, where $i = 0, 1, \dots$, and $(\xi_i^{(n)})$ is a sequence of Bernoulli random variables with $\mathsf{P}(\xi_i^{(n)} = \pm 1/\sqrt{n}) = \frac12$, then the (piecewise constant) process $W^{(n)} = (W_t^{(n)})_{t \ge 0}$ with

$$W_t^{(n)} = \sum_{i=1}^{[nT(t)]} \xi_i^{(n)}$$

weakly converges (as $n \to \infty$) to the process $(W_{T(t)})_{t \ge 0}$, where $W$ is a Wiener process.

We now set

$$r_i^{(n)}(j) = F\Big(f(T_i^{(n)}) + g(T_i^{(n)})\,\frac{j}{\sqrt{n}}\Big), \qquad j = 0, \pm 1, \dots, \pm i.$$

To obtain a discrete-time model of the evolution of the interest rates $r_i^{(n)}$ ranging in the sets $\{r_i^{(n)}(j),\ j = 0, \pm 1, \dots, \pm i\}$, let $r_0^{(n)} = r_0$ and assume that, with probability $\frac12$, the state $r_i^{(n)}(j)$, $j = 0, \pm 1, \dots, \pm i$, changes either to $r_{i+1}^{(n)}(j + 1)$ or to $r_{i+1}^{(n)}(j - 1)$.

Clearly, this construction brings one (see [426]) to the binomial model of the evolution of interest rates, which can be depicted as follows (for a given value of $n = 1, 2, \dots$):


§4b. Standard Diffusion Model of Stock Prices 
(Geometric Brownian Motion) and its Generalizations 


1. We have already pointed out (Chapter I, § 1b) that the first model developed for the description of the evolution of stock prices $S = (S_t)_{t \ge 0}$ had been the linear model of L. Bachelier (1900)

$$S_t = S_0 + \mu t + \sigma W_t, \tag{1}$$

where $W = (W_t)_{t \ge 0}$ was a standard Brownian motion (a Wiener process).

Although, in principle, it was a major step in the application of probabilistic concepts to the analysis of financial markets, it was clear from the very beginning that the model (1) had many deficiencies. The first of them was that the variables $S_t$ (that were supposed to represent stock prices) could take negative values.

Essential in this connection was the next step, made by P. Samuelson [420]. He suggested describing stock prices in terms of a geometric (or, as he put it, economic) Brownian motion

$$S_t = S_0\,e^{\mu t}\,e^{\sigma W_t - \frac{\sigma^2}{2} t}. \tag{2}$$

That is, Samuelson assumed that the logarithms of the prices $S_t$ (rather than the prices themselves) are governed by a linear model of type (1), so that

$$\ln \frac{S_t}{S_0} = \Big(\mu - \frac{\sigma^2}{2}\Big) t + \sigma W_t. \tag{3}$$

By Itô's formula (§ 3d) we immediately obtain that

$$dS_t = S_t\,(\mu\,dt + \sigma\,dW_t). \tag{4}$$


If we rewrite this relation in the following (not very rigorous) form

$$\frac{dS_t}{S_t} = \mu\,dt + \sigma\,dW_t, \tag{5}$$

then one can reasonably consider its discrete-time approximation (with time step $\Delta$; here we set $\Delta S_t = S_t - S_{t-\Delta}$)

$$\frac{\Delta S_t}{S_{t-\Delta}} \approx \mu \Delta + \sigma\,\Delta W_t,$$

which is very much similar to the expression

$$\frac{\Delta S_n}{S_{n-1}} = \rho_n \tag{6}$$

with $\mathcal{F}_n$-measurable $\rho_n$, which we have already discussed (see, e.g., Chapter II, § 2a.6) in the description of the evolution of stock prices proceeding in discrete time.
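The closeness of the exact formula (2) and the discrete-time approximation can be checked along a single simulated path. The following is a minimal sketch with hypothetical names and parameter values; it assumes that the increment $\Delta W_t = W_t - W_{t-\Delta}$ is sampled as a Gaussian variable with variance $\Delta$ and that both formulas are driven by the same increments.

```python
import math
import random


def gbm_exact(s0, mu, sigma, t, w_t):
    """Formula (2): S_t = S_0 exp(mu t) exp(sigma W_t - sigma^2 t / 2)."""
    return s0 * math.exp((mu - 0.5 * sigma ** 2) * t + sigma * w_t)


def gbm_discrete(s0, mu, sigma, increments, delta):
    """Recursion S_t = S_{t-delta} (1 + mu delta + sigma dW) over all steps."""
    s = s0
    for dw in increments:
        s *= 1.0 + mu * delta + sigma * dw
    return s


rng = random.Random(42)
s0, mu, sigma, t_end, n = 1.0, 0.05, 0.2, 1.0, 20_000   # illustrative values
delta = t_end / n
increments = [rng.gauss(0.0, math.sqrt(delta)) for _ in range(n)]

exact = gbm_exact(s0, mu, sigma, t_end, sum(increments))
approx = gbm_discrete(s0, mu, sigma, increments, delta)
print(exact, approx)
```

The two values differ only through the discretization error, which shrinks as the step $\Delta$ decreases; note also that the recursion keeps prices positive only as long as each factor $1 + \mu\Delta + \sigma\,\Delta W_t$ is positive, whereas (2) is positive by construction.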

It is also instructive to compare (5) with the expression for the (concomitant) bank account $B = (B_t)_{t \ge 0}$ with (fixed) interest rate $r > 0$, which is governed by the equation

$$dB_t = r B_t\,dt. \tag{7}$$




If the financial market under consideration is formed by a bank account $B = (B_t)_{t \ge 0}$ and stock of price $S = (S_t)_{t \ge 0}$ governed by equations (7) and (4), respectively, then we shall say that we have the standard diffusion (B, S)-model or a standard diffusion (B, S)-market.

This standard diffusion model was considered in the calculations of option prices by F. Black and M. Scholes [44] and R. Merton [346]. The famous Black-Scholes formula for the rational (fair) price of European call options was obtained precisely in the discussion of this model. (We devote Chapter VIII to these issues.)

It is fairly obvious that the standard model is based on assumptions that are not perfectly practical. In fact, it is implied that the interest rate $r$ of the bank account is constant (while it actually fluctuates); the coefficients $\sigma$ of volatility and $\mu$ of growth must also be constant (in fact, they change with time). In the deduction of the Black-Scholes formula one also assumes (see Chapter VIII, §§ 1b, c below) that the (B, S)-market is ‘frictionless’ (no transaction costs, no dividend payments, no delays in obtaining information or taking decisions), one can withdraw (or deposit) any amount from the bank account, and buy or sell any number of shares.

All this suggests that the standard diffusion (B, S)-market is, of course, a simplification. Nevertheless, it remains one of the most popular models.

The following words of F. Black concerning the ‘simplicity’ of this model ([43], 1988) are noteworthy in this connection:

“Yet that weakness is also its greatest strength. People like the model because they can easily understand its assumptions. The model is often good as a first approximation, and if you can see the holes in the assumptions you can use the model in more sophisticated ways.”


2. One of these refinements, immediately suggesting itself, is to consider models with constants $r$, $\mu$, and $\sigma$ replaced by deterministic or random (adapted to $(\mathcal{F}_t)_{t \ge 0}$) functions $r(t)$, $\mu(t)$, and $\sigma(t)$:

$$dB_t = r(t) B_t\,dt, \tag{8}$$

$$dS_t = S_t\,\big(\mu(t)\,dt + \sigma(t)\,dW_t\big). \tag{9}$$

Of course, in our assessments of more complicated models we must be primarily backed by facts that have been established experimentally and cannot be explained, ‘captured’ by the standard (B, S)-model. On the other hand, these more refined models should not become complicated to a point where it is ‘no more possible to calculate anything’.

In this connection we should mention the so-called ‘smile effect’. This is just one fact that the standard (B, S)-model fails to explain; and this failure has brought on various versions of its generalizations and refinements.

The smile effect is as follows.

Assume that the price of a stock is described by equation (4) with $S_0 = 1$ (for simplicity) and let $C = C(\sigma, T, K)$ be the rational price of the standard European call option with payoff $f_T = \max(S_T - K, 0)$.

The Black-Scholes formula for $C$ (see formula (9) in Chapter I, § 1b or, for a more thorough discussion, Chapter VIII, §§ 1b, c) explicitly describes the dependence of this price on the volatility $\sigma$, the exercise time $T$, and the strike price $K$.

However, we can consider the actual market prices of such options with given $T$ and $K$ and compare these with the values of $C$.

Let $\widehat{C}(T, K)$ be this (actual) price asked by financial markets.

We now define $\widehat{\sigma} = \widehat{\sigma}(T, K)$ as the solution of the equation

$$C(\sigma, T, K) = \widehat{C}(T, K),$$

where $C(\sigma, T, K)$ is the function given by the Black-Scholes formula.

It is precisely this characteristic, $\widehat{\sigma}(T, K)$, which is called the implied volatility, that reveals certain ‘holes’ (using the expression of F. Black) in the initial standard model. Indeed, it is experimentally established that

(a) the value of $\widehat{\sigma}(T, K)$ changes with $T$ for fixed $K$;
(b) the value of $\widehat{\sigma}(T, K)$ also changes with $K$ for fixed $T$ as a (downwards) convex function (this explains the name ‘smile effect’).
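Computing an implied volatility amounts to inverting the Black-Scholes price in $\sigma$. The sketch below is an illustration only: the formula (9) of Chapter I is not reproduced in this section, so the code assumes the standard form $C = S_0 \Phi(d_1) - K e^{-rT} \Phi(d_2)$ with $d_{1,2} = \big(\ln(S_0/K) + (r \pm \sigma^2/2)T\big)/(\sigma\sqrt{T})$, and it solves the equation by bisection, relying on the fact that $C$ is strictly increasing in $\sigma$. The function names, the default $r = 0$, and the bracketing interval are hypothetical choices.

```python
import math


def norm_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_call(sigma, T, K, s0=1.0, r=0.0):
    """Black-Scholes price C(sigma, T, K) of a European call (standard form)."""
    d1 = (math.log(s0 / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    return s0 * norm_cdf(d1) - K * math.exp(-r * T) * norm_cdf(d2)


def implied_vol(price, T, K, s0=1.0, r=0.0, lo=1e-6, hi=5.0, tol=1e-10):
    """Solve C(sigma, T, K) = price for sigma by bisection on [lo, hi]."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if bs_call(mid, T, K, s0, r) < price:
            lo = mid            # model price too low: volatility must be larger
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


# Round trip: a price generated with sigma = 0.25 should give back 0.25.
quoted = bs_call(0.25, 0.5, 1.1)
recovered = implied_vol(quoted, 0.5, 1.1)
print(quoted, recovered)
```

Applied to market quotes $\widehat{C}(T, K)$ over a grid of strikes $K$ and maturities $T$, the same routine would produce the surface $\widehat{\sigma}(T, K)$ discussed in (a) and (b); the input price must lie within the no-arbitrage bounds for a solution to exist.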


To take (a) into account R. Merton proposed ([346], 1973) to regard $\mu$ and $\sigma$ in the standard model as functions of time, $\mu = \mu(t)$ and $\sigma = \sigma(t)$. Schemes of that kind are indeed used in financial analysis, in particular in calculations involving American options.

The smile effect (b) appears more delicate; various modifications of the standard model (‘diffusion with jumps’, ‘stochastic volatility’, and other models) have been developed for its explanation.

The most transparent of them is the model suggested by B. Dupire [121], [122], in which

$$dS_t = S_t\,\big(\mu(t)\,dt + \sigma(S_t, t)\,dW_t\big), \tag{10}$$

where $\sigma = \sigma(S, t)$ is a function of the state $S$ and time $t$.

In the already mentioned papers [121] and [122] Dupire showed also that, accepting the idea of an ‘arbitrage-free complete market’, one can estimate the unknown volatility on the basis of the actual (observed at time $t < T$ for the states $S_t = s$) prices $C_{s,t}(K, T)$ of standard European call options with exercise time $T$ and strike price $K$.


3. Conceptually, the process of the refinement of the standard (B, S)-model (4) and (7) has much in common with the development of the most simple discrete-time models.

With this in mind, we recall (see Chapter II, § 1d) that in the discrete-time case we started from the representation

$$S_n = S_0\,e^{H_n} \tag{11}$$

with $H_n = h_1 + \dots + h_n$ and $h_n = \mu_n + \sigma_n \varepsilon_n$, where $\mu_n$ and $\sigma_n$ were some (nonrandom) numbers and $\varepsilon_n \sim \mathcal{N}(0, 1)$.

If the coefficients $\mu(t)$ and $\sigma(t)$ in (9) are deterministic, then we can write $S_t$ as follows:

$$S_t = S_0\,e^{H_t}, \tag{12}$$

where

$$H_t = \int_0^t \Big(\mu(s) - \frac{\sigma^2(s)}{2}\Big)\,ds + \int_0^t \sigma(s)\,dW_s. \tag{13}$$

Clearly, $H_t$ has in this case the Gaussian distribution and (12) is a continuous-time analog of the representation (11).
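For deterministic coefficients the Gaussian law of $H_t$ in (13) is determined by two integrals: the mean $\int_0^t (\mu(s) - \sigma^2(s)/2)\,ds$ and the variance $\int_0^t \sigma^2(s)\,ds$. The sketch below computes them by the midpoint rule; the particular functions $\mu(s)$ and $\sigma(s)$ and all names are hypothetical, chosen only so that the result can be checked by hand.

```python
import math


def integrate(func, a, b, n=10_000):
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    h = (b - a) / n
    return sum(func(a + (k + 0.5) * h) for k in range(n)) * h


def log_return_law(mu, sigma, t):
    """Mean and variance of the Gaussian variable H_t in (13)."""
    mean = integrate(lambda s: mu(s) - 0.5 * sigma(s) ** 2, 0.0, t)
    var = integrate(lambda s: sigma(s) ** 2, 0.0, t)
    return mean, var


def mu(s):
    return 0.05 + 0.01 * s                  # hypothetical growth coefficient


def sigma(s):
    return 0.2 * math.exp(-0.5 * s)         # hypothetical volatility


mean, var = log_return_law(mu, sigma, 2.0)
# For a Gaussian H_t, E exp(H_t) = exp(mean + var / 2); here S_0 = 1.
expected_price = math.exp(mean + 0.5 * var)
print(mean, var, expected_price)
```

The volatility terms cancel in `mean + 0.5 * var`, so the expected price equals $S_0 \exp\{\int_0^t \mu(s)\,ds\}$, which is $e^{0.12}$ for the functions chosen above.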

Further, in Chapter II, § 1d we considered the AR, MA, and ARMA models, in which the volatility $\sigma_n$ was assumed to be constant ($\sigma_n \equiv \sigma = \mathrm{Const}$), while the $\mu_n$ were defined by the ‘past’ variables $h_{n-1}, h_{n-2}, \dots$ and $\varepsilon_{n-1}, \varepsilon_{n-2}, \dots$.

Finally, in the ARCH and GARCH models we assumed that $h_n = \sigma_n \varepsilon_n$, while the volatility $\sigma_n$ depended on the ‘past’ (see, e.g., (19) in Chapter II, § 1d).

We point out that all these models had a single source of randomness, white (Gaussian) noise $\varepsilon = (\varepsilon_n)$.

As regards the stochastic volatility model (see Chapter II, § 1d.7), it is assumed there that we have two sources of randomness, two independent white (Gaussian) noises $\varepsilon = (\varepsilon_n)$ and $\delta = (\delta_n)$ such that $h_n = \sigma_n \varepsilon_n$, where $\sigma_n = e^{\frac12 \Delta_n}$ and

$$\Delta_n = a_0 + \sum_{i=1}^{p} a_i \Delta_{n-i} + c\,\delta_n \tag{14}$$

(see Chapter II, § 3c).

In the same way, in the continuous-time case we can consider various counterparts to ARCH-GARCH models or to models of stochastic volatility kind.

In the second case we have, e.g., models of the following form:

$$dS_t = S_t\,\big(\mu(t, S_t, \sigma_t)\,dt + \sigma_t\,dW_t^{\varepsilon}\big), \tag{15}$$

$$d\Delta_t = a(t, \Delta_t)\,dt + b(t, \Delta_t)\,dW_t^{\delta}, \tag{16}$$

where $\Delta_t = \ln \sigma_t^2$, while $W^{\varepsilon} = (W_t^{\varepsilon})_{t \ge 0}$ and $W^{\delta} = (W_t^{\delta})_{t \ge 0}$ are two independent Wiener processes (cf. (37) and (38) in Chapter II, § 3a; models of this kind were considered in [235], [364], [432], and [477]).

Using the models (15), (16) in the situation when one can observe only the process $S = (S_t)_{t \ge 0}$ (and not the volatility $\sigma = (\sigma_t)_{t \ge 0}$), we arrive at an ‘incomplete’ market, where one can give no clear definition of, say, a rational price of an option, so that one must resort to a more complicated analysis of upper and lower prices (see Chapter V, § 1b).

Diffusion models of ARCH-GARCH type in the continuous-time case seem to be more promising in this respect since one can avail oneself of the well-developed machinery of an ‘arbitrage-free complete’ market.

Dupire's model (10) considered above is one example of such a Markovian model.

Pursuing the idea of the ‘dependence on the past’ inherent in the ARCH-GARCH models it is reasonable to consider, e.g., the representations

$$S_t = S_0\,e^{H_t} \tag{17}$$

such that the diffusion process $H = (H_t)_{t \ge 0}$ is a component of a multivariate diffusion process $(H_t, H_t^1, \dots, H_t^p)_{t \ge 0}$ generated by a single Wiener process.

A corresponding example is provided by a stationary Gaussian process with rational spectral density, which can be regarded as a component of a multivariate Markov process satisfying a linear system of stochastic differential equations. (See Theorem 15.4 and the system of equations (15.64) in [303] for details.)


4. We continue the discussion of the diffusion models introduced above and de- 
scribing the evolution of stock prices in Chapter VII, where we shall consider these 
models in the context of an ‘arbitrage-free complete market’. 

We pointed out in Chapter I that the concept of the absence of arbitrage is precisely the economists' concept of efficient market, to which we stick in our analysis. In the discrete-time case, the First fundamental theorem of asset pricing (see Chapter V, § 2b) gives one a martingale criterion of whether a particular (B, S)-market is arbitrage-free. This property imposes certain constraints on both bank account $B = (B_n)_{n \ge 0}$ and stock price $S = (S_n)_{n \ge 0}$.

In the continuous-time case it is also reasonable to stay within the realm of 
arbitrage-free (B,S)-markets. Again, this imposes certain constraints on both dy- 
namics of stock prices and evolution of the bank account (see Chapter VII). 

It should be noted that (compared with discrete time) the continuous-time case is 
more delicate. This is primarily due to the technical complexity of the corresponding 
machinery of stochastic calculus and the considerable gap between the potentials 
of continuous and discrete trading. 

Another property shaping the structure of a (B, S)-market is its completeness 
(or lack of it), which means that we can always make up a portfolio with value 
reproducing our ‘liabilities’ (see details in Chapter V, § 1b). 

Generally speaking, this desirable property of a market is rather an exception 
than a rule. It is remarkable, however, that we obtain this property of complete- 
ness under fairly general assumptions in the case of diffusion (B, S)-markets (see 
Chapter VII, § 4a). 


§4c. Diffusion Models of the Term Structure 
of Prices in a Family of Bonds 


1. We already discussed bonds as IOUs and their market prices $P(t, T)$ in general, at the very beginning of this book (Chapter I, § 1a). We also introduced there several characteristics of bonds such as the initial price $P(0, T)$, the face value $P(T, T)$ (which we, for definiteness, set to be equal to one), the current interest rate, the yield to maturity, and some others. We mentioned that the question of the structure of the prices $P(t, T)$, $0 \le t \le T$, regarded as a family of stochastic objects is, in a certain sense, more complicated than the question of the probabilistic structure of stock prices.

One difficulty here is that if, for a fixed $T$, we regard $P(t, T)$ as a stochastic process for $0 \le t \le T$, then, first of all, this process must be conditional because $P(T, T) = 1$.

A typical example is a Brownian bridge $X = (X_t)_{t \le T}$ considered in § 3a and governed by the equation

$$dX_t = \frac{1 - X_t}{T - t}\,dt + dW_t$$

with $X_0 = \alpha$. It has the property that $X_t \to 1$ as $t \uparrow T$.

In a similar way to Bachelier's using a linear Brownian motion (see formula (1) in § 4b) to simulate stock prices, one could treat the process $X = (X_t)_{t \le T}$ as a prospective model of the evolution of the bond prices $P(t, T)$, $t \le T$.

The complications here are the same as with shares: the variables $X_t$, $t > 0$, can in general assume any values in $\mathbb{R}$, while for a bond price, by its meaning, we have $0 \le P(t, T) \le 1$.

Another complication that one meets in the construction of models of bond prices relates to the fact that, as a rule, there are bonds with various times of maturity $T$ on the market and in investors' portfolios. Therefore, in sound models of the evolution of bond prices one should not be restricted to the consideration of some particular value of $T$; rather, they must be constructed to cover some subset $\mathbb{T} \subseteq \mathbb{R}_+$ of terms containing all possible values of the maturity times of the bonds traded on a market. There should also be no opportunities for arbitrage on the financial market, i.e., no one should be able to buy a bond and sell it profitably running no risks.

2. In our construction of models of the term structure of the prices $P(t, T)$, $t \le T$, of bonds with maturity date $T$ (we shall call them T-bonds) we shall assume that $\mathbb{T} = \mathbb{R}_+$. In other words, we assume that there exists a (continual) family of bonds with prices $P(t, T)$, $0 \le t \le T$, $T \in \mathbb{R}_+$.

We shall assume in addition that there are no bond interest payments (coupon 
payments), i.e., we consider only zero-coupon bonds. 


3. In dealing with a single T-bond or a family of T-bonds, $T > 0$, one uses several characteristics of bonds that will be helpful to us over the entire process of building and analyzing models.

Namely, let us represent the price $P(t, T)$ by the following expressions:

$$P(t, T) = e^{-r(t, T)(T - t)}, \qquad t \le T, \tag{1}$$

$$P(t, T) = e^{-\int_t^T f(t, s)\,ds}, \qquad t \le T, \tag{2}$$

with some nonnegative functions $r(t, T)$ and $f(t, s)$, $0 \le t \le s \le T$.

Clearly,

$$r(t, T) = -\frac{\ln P(t, T)}{T - t}, \qquad t < T, \tag{3}$$

and (provided that $P(t, T)$ is differentiable with respect to $T$, $T > t$)

$$f(t, T) = -\frac{\partial}{\partial T} \ln P(t, T), \qquad t \le T. \tag{4}$$

In the case of one zero-coupon T-bond, for $t < T$ we have defined its yield or the yield $\rho(T - t, t)$ to the maturity time by the formula

$$P(t, T) = e^{-(T - t) \ln(1 + \rho(T - t, t))} \tag{5}$$

(see formula (6) in Chapter I, § 1a). Comparing this with (1) we see that

$$r(t, T) = \ln(1 + \rho(T - t, t)). \tag{6}$$

The quantity $r(t, T)$ is also often called the yield of the T-bond at time $t < T$ (due for the remaining time $T - t$), and the function $t \mapsto r(t, T)$ ($t < T$) is called the yield curve of the T-bond.

The quantities $r(t, T)$ are especially useful when one considers various T-bonds with time of maturity $T > t$. In this case, the function $T \mapsto r(t, T)$ is also called the yield curve of the family of T-bonds (at time $t$).

To avoid treating the cases of a single bond and a bond family separately we shall regard $r(t, T)$ as a function of two variables, $t$ and $T$, assuming that $0 \le t \le T$ and $T > 0$. We shall call it the yield again.

The quantities $f(t, T)$ are usually called the forward rates for the contract sold at time $t$.

A key role in the subsequent analysis is played by the instantaneous rate $r(t)$ at time $t$ that can be defined in terms of the forward interest rate by the equality

$$r(t) = f(t, t). \tag{7}$$

We point out also the following relation between $r(t, T)$ and $f(t, T)$:

$$f(t, T) = r(t, T) + (T - t)\,\frac{\partial r(t, T)}{\partial T}, \tag{8}$$

which is an immediate consequence of (4) and (1).

Hence, if, e.g., the derivative $\dfrac{\partial r(t, T)}{\partial T}$ is finite, then setting $T = t$ we obtain

$$r(t) = f(t, t) = r(t, t) \quad \Big(= \lim_{T \downarrow t} r(t, T)\Big). \tag{9}$$
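The relations between prices, yields, and forward rates can be verified numerically. The sketch below is an illustration with a hypothetical forward curve $f(t, s) = 0.03 + 0.01(s - t)$: it builds $P(t, T)$ from the forward rates, computes the yield $r(t, T) = -\ln P(t, T)/(T - t)$, and checks the relation $f(t, T) = r(t, T) + (T - t)\,\partial r(t, T)/\partial T$ with a central finite difference. All names and numbers are assumptions made for the example.

```python
import math


def forward(t, s):
    """Hypothetical forward curve f(t, s); here it depends only on s - t."""
    return 0.03 + 0.01 * (s - t)


def price(t, T, n=2_000):
    """P(t, T) = exp(-int_t^T f(t, s) ds), integral by the midpoint rule."""
    h = (T - t) / n
    integral = sum(forward(t, t + (k + 0.5) * h) for k in range(n)) * h
    return math.exp(-integral)


def yield_rate(t, T):
    """Yield r(t, T) = -ln P(t, T) / (T - t), defined for t < T."""
    return -math.log(price(t, T)) / (T - t)


def forward_from_yield(t, T, eps=1e-4):
    """Right-hand side r(t, T) + (T - t) dr/dT, derivative by central difference."""
    dr_dT = (yield_rate(t, T + eps) - yield_rate(t, T - eps)) / (2.0 * eps)
    return yield_rate(t, T) + (T - t) * dr_dT


t, T = 1.0, 4.0
lhs = forward(t, T)                    # forward rate f(t, T)
rhs = forward_from_yield(t, T)         # the same rate recovered from yields
short = yield_rate(t, t + 1e-6)        # approximates r(t) = f(t, t) = r(t, t)
print(lhs, rhs, short)
```

For this curve the yield is $r(t, T) = 0.03 + 0.005(T - t)$, so both sides equal $0.06$ at $T - t = 3$, and the yield over a vanishing horizon approaches the instantaneous rate $f(t, t) = 0.03$.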
4. We consider now the issue of the dynamics of the bond prices P(t, T). The two
main approaches here are the indirect and the direct ones. (Cf. Chapter I, §1b.5.) 
Taking the first approach, one represents $P(t, T)$ as the composite

$$P(t, T) = F(t, r(t); T), \tag{10}$$

where $r = (r(t))_{t \ge 0}$ is some instantaneous rate.

In such models the entire term structure of prices is determined by a single factor, $r = (r(t))_{t \ge 0}$. For this reason, they are called single-factor models.

An important (and treatable by analytic means) subclass of such models is described by functions

$$F(t, r(t); T) = e^{\alpha(t, T) - r(t)\beta(t, T)}. \tag{11}$$

These models are said to be affine, or, sometimes, exponential affine ([117], [119]) since $\ln F(t, x; T) = \alpha(t, T) - x\beta(t, T)$ is a linear function of $x$ with some coefficients $\alpha(t, T)$ and $\beta(t, T)$.

Another well-known approach was used in [219]. The corresponding model is called the HJM-model after its authors (D. Heath, R. Jarrow, and A. Morton). The idea of this approach is to seek the prices $P(t, T)$ as solutions of certain stochastic differential equations of type

$$dP(t, T) = P(t, T)\big(A(t, T)\,dt + B(t, T)\,dW_t\big), \tag{12}$$

where $A(T, T) = B(T, T) = 0$ and $P(T, T) = 1$ (cf. formula (4) in § 4b). Alternatively, one can consider equations

$$df(t, T) = a(t, T)\,dt + b(t, T)\,dW_t \tag{13}$$

for the forward interest rates $f(t, T)$.

This is an appropriate place to recall that, throughout, we assume that we have some filtered probability space (stochastic basis)

$$(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t \ge 0}, \mathsf{P}).$$

All the functions in question ($P(t, T)$, $f(t, T)$, $A(t, T)$, ...) are assumed to be $\mathcal{F}_t$-measurable for $t \le T$. As usual, $W = (W_t, \mathcal{F}_t)_{t \ge 0}$ is a standard Wiener process; moreover, we assume that the conditions ensuring the existence of the stochastic integrals in (12) and (13) and the existence of a solution to (12) are satisfied.

By relation (2) between the prices P(t, T) and the forward interest rates f (t, T}, 
using It6’s formula and stochastic Fubini’s theorem (see Lemma 12.5 in [303] or 
[395]}, we can obtain the following relation between the coefficients of the above 
equations (sce [219]): 


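a(t, T) = \frac{\partial B(t,T)}{\partial T} B(t, T) - \frac{\partial A(t,T)}{\partial T},    (14)

b(t, T) = -\frac{\partial B(t,T)}{\partial T},    (15)

and

A(t, T) = r(t) - \int_t^T a(t, s)\,ds + \frac{1}{2} \Bigl( \int_t^T b(t, s)\,ds \Bigr)^2,    (16)

B(t, T) = -\int_t^T b(t, s)\,ds.    (17)


By (13) we also obtain

dr(t) = \Bigl( \frac{\partial f}{\partial T}(t, t) + a(t, t) \Bigr) dt + b(t, t)\,dW_t.    (18)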
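
The way (16) and (17) arise can be sketched as follows. The sketch assumes that relation (2) has the form P(t, T) = \exp\{-\int_t^T f(t, s)\,ds\}, that r(t) = f(t, t), and that the coefficients are regular enough for the stochastic Fubini theorem to apply; it is an outline under these assumptions, not a full proof.

```latex
\begin{align*}
\ln P(t,T) &= -\int_t^T f(t,s)\,ds, \\
d\ln P(t,T) &= f(t,t)\,dt - \int_t^T df(t,s)\,ds
  = \Bigl(r(t) - \int_t^T a(t,s)\,ds\Bigr)dt
    - \Bigl(\int_t^T b(t,s)\,ds\Bigr)dW_t, \\
d\ln P(t,T) &= \Bigl(A(t,T) - \tfrac12 B^2(t,T)\Bigr)dt + B(t,T)\,dW_t
  \qquad\text{(It\^o's formula applied to (12))}.
\end{align*}
```

Comparing the dW_t-terms of the last two lines gives (17); comparing the dt-terms then gives (16). Differentiating (17) and (16) with respect to T yields (15) and (14).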


Remark 2. In several papers (e.g., in [38], [359], and [421]), in place of the forward
interest rate f(t, T), the authors consider its modification


r(t, x) = f(t, t + x).


They justify this by a certain analytic simplification. For instance, after this change
of variables equality (2) acquires a slightly simpler form:


P(t, T) = \exp\Bigl\{ -\int_0^{T-t} r(t, x)\,dx \Bigr\}.    (19)


5. We discuss the structure of bond prices P(t, T), forward rates f(t, T), and
instantaneous rates r(t) in greater detail, in the context of a ‘complete arbitrage-free’
T-bond market, in Chapter VII. Here we point out only that, as with a (B, S)-market
(see § 4b.4), our condition of an arbitrage-free T-bond market imposes certain
constraints on the structure of coefficients in (12) and (13) and on the coefficients
\alpha(t, T), \beta(t, T), and the interest rates in the affine models (11), thus distinguishing
some natural classes of arbitrage-free models of a bond market.


5. Semimartingale Models 


§5a. Semimartingales and Stochastic Integrals 


1. The abundance of the above-mentioned models designed for the description of
the evolution of such financial indexes as, say, the price of a share, poses a natural
problem of distinguishing a sufficiently general class of stochastic processes that on
the one hand includes many of the processes occurring in these models and, on the
other hand, yields to analytic means.

From many viewpoints, one such class is that of semimartingales, i.e., of stochastic
processes X = (X_t)_{t \ge 0} representable (not necessarily in a unique way) as
sums

X_t = X_0 + A_t + M_t,    (1)

where A = (A_t, \mathcal{F}_t)_{t \ge 0} is a process of bounded variation (A \in \mathcal{V}) and
M = (M_t, \mathcal{F}_t)_{t \ge 0} is a local martingale (M \in \mathcal{M}_{loc}), both defined on some filtered
probability space (stochastic basis)

(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t \ge 0}, P)

satisfying the usual conditions, i.e., the \sigma-algebra \mathcal{F} is P-complete and the \mathcal{F}_t,
t \ge 0, must contain all the sets in \mathcal{F} of P-probability zero (cf. §3b.3) and be right
continuous (\mathcal{F}_t = \mathcal{F}_{t+}, t \ge 0); see, e.g., [250].

We also assume that X = (X_t)_{t \ge 0} is a process adapted to (\mathcal{F}_t)_{t \ge 0} and its
trajectories t \mapsto X_t(\omega), \omega \in \Omega, are right-continuous with limits from the left. In
French literature a process of this type is called un processus càdlàg (Continu À
Droite avec des Limites À Gauche).


2. We turn to semimartingales for several reasons. First, this class is fairly wide:
it contains discrete-time processes X = (X_n)_{n \ge 0} (for one can associate with such
a process the continuous-time process X^* = (X^*_t)_{t \ge 0} with X^*_t = X_{[t]}, which clearly
belongs to \mathcal{V}), diffusion processes, diffusion processes with jumps, point processes,
processes with independent increments (with several exceptions; see [250; Chapter
I, § 4c]), and many other processes.

The class of semimartingales is stable with respect to many transformations:
absolutely continuous changes of measure, time changes, localization, changes of
filtration, and so on (see a discussion in [250; Chapter I, § 4c]).

Second, there exists a well-developed machinery of stochastic calculus of
semimartingales, based on the concepts of Markov times, martingales, super- and
submartingales, local martingales, optional and predictable \sigma-algebras, and so on.

In a certain sense, the crucial factor of the success of stochastic calculus of 
semimartingales is the fact that it is possible to define stochastic integrals with 
respect to semimartingales. 


Remark 1. One can find an absorbing concise exposition of the central ideas and 
concepts of the stochastic calculus of semimartingales and stochastic integrals with 
respect to semimartingales in the appendix to [139] written by P.-A. Meyer, a 
pioneer of modern stochastic calculus. 

An important ingredient of the concept of stochastic bases (\Omega, \mathcal{F}, (\mathcal{F}_t)_{t \ge 0}, P),
which are definition domains for semimartingales, is a flow of \sigma-algebras (\mathcal{F}_t)_{t \ge 0}.
As already mentioned in Chapter I, § 2a, this is also a key concept in financial theory: 
this is the flow of ‘information’ available on a financial market and underlying the 
traders’ decisions. 

It should be noted that, of course, one encounters processes that are not semi- 
martingales in some ‘natural’ models (from the viewpoint of financial theory). 

A typical example is a fractional Brownian motion B^H = (B^H_t)_{t \ge 0} (see § 2c)
with an arbitrary Hurst parameter 0 < H < 1 (except for the case of H = 1/2
corresponding to a usual Brownian motion).


3. We proceed now to stochastic integration with respect to semimartingales, which 
nicely describes the growth of capital in self-financing strategies. 

Let X = (X_t)_{t \ge 0} be a semimartingale on a stochastic basis (\Omega, \mathcal{F}, (\mathcal{F}_t)_{t \ge 0}, P)
and let \mathcal{E} be the class of simple functions, i.e., of functions


f(t, \omega) = Y_0(\omega) I_{\{0\}}(t) + \sum_i Y_i(\omega) I_{(r_i, s_i]}(t)    (2)


that are linear combinations of finitely many elementary functions

f_i(t, \omega) = Y_i(\omega) I_{(r_i, s_i]}(t)    (3)


with \mathcal{F}_{r_i}-measurable random variables Y_i(\omega); cf. § 3c.
As with Wiener processes (Brownian motions), a ‘natural’ way to define the
stochastic integral I_t(f) of a simple function (2) with respect to a semimartingale X
(this integral is also denoted by (f \cdot X)_t, \int_0^t f(s, \omega)\,dX_s, or \int_{(0,t]} f(s, \omega)\,dX_s) is to
set


I_t(f) = \sum_i Y_i(\omega) [X_{s_i \wedge t} - X_{r_i \wedge t}],    (4)


where a \wedge b = min(a, b).
Remark 2. The stochastic integral

I_t(f_i) = Y_i(\omega) [X_{s_i \wedge t} - X_{r_i \wedge t}]


has a perfectly transparent interpretation in financial theory: if X = (X_t)_{t \ge 0} is, say,
the price of a share, and if you are buying Y_i(\omega) shares at time r_i (at the price X_{r_i}),
then Y_i(\omega) [X_{s_i} - X_{r_i}] is precisely your profit (which can also be negative) from
selling these shares at time s_i, when their price is X_{s_i}.

We emphasize that we need not assume that X is a semimartingale in this
definition of the stochastic integrals I(f) = (I_t(f))_{t \ge 0} of simple functions;
expression (4) makes sense for any process X = (X_t)_{t \ge 0}.
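
Formula (4) and its profit interpretation can be illustrated on a single trajectory. The sketch below is only an illustration: the price path, the trade list, and the function name `simple_integral` are hypothetical, and the positions Y_i are plain numbers standing for the values of the random variables on one fixed outcome.

```python
import numpy as np

def simple_integral(t, path, trades):
    """I_t(f) = sum_i Y_i * (X_{min(s_i, t)} - X_{min(r_i, t)}), as in (4).

    path   : callable u -> X_u (any process; no semimartingale property is used)
    trades : iterable of (Y_i, r_i, s_i) with r_i < s_i
    """
    return sum(y * (path(min(s, t)) - path(min(r, t))) for y, r, s in trades)

# Hypothetical share-price path on [0, 3], linear between the listed times.
times = np.array([0.0, 1.0, 2.0, 3.0])
prices = np.array([100.0, 104.0, 101.0, 107.0])
path = lambda u: float(np.interp(u, times, prices))

# Hold 10 shares over (0, 1], then 5 shares over (2, 3].
trades = [(10.0, 0.0, 1.0), (5.0, 2.0, 3.0)]
profit_at_3 = simple_integral(3.0, path, trades)     # 10 * 4 + 5 * 6
profit_at_2_5 = simple_integral(2.5, path, trades)   # 10 * 4 + 5 * 3
```

The truncation by min(., t) means that a position still open at time t is valued at the current price X_t.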

However, our semimartingale assumption becomes decisive when we are going 
to extend the definition of the integral from simple functions f = f(t,w) to broader 
function classes (with preservation of its ‘natural’ properties). 

If X = (X_t)_{t \ge 0} is a Brownian motion, then, in accordance with § 3c, we can
define the stochastic integral I_t(f) for each (measurable) function f = (f(s, \omega))_{s \le t},
provided that the f(s, \omega) are \mathcal{F}_s-measurable and


\int_0^t f^2(s, \omega)\,ds < \infty    (P-a.s.).    (5)


The key factor here is that we can approximate such functions by simple functions
(f_n, n \ge 1) such that the corresponding integrals (I_t(f_n), n \ge 1) converge (in
probability, at any rate). We have denoted the corresponding limit by I_t(f) and
have called it the stochastic integral of f with respect to the Brownian motion (over
the interval (0, t]).

In replacing the Brownian motion by an arbitrary semimartingale, we base the
construction of the stochastic integral I_t(f) again on the approximation of f by
simple functions (f_n, n \ge 1) with well-defined integrals I_t(f_n) and the subsequent
passage to the limit as n \to \infty.

However, the problem of approximation is now more complicated, and we need to 
impose certain constraints on f adapted to the properties of the semimartingale X. 


4. As an illustration, we present several results that seem appropriate here and
have been obtained in the case of X = M, where M = (M_t, \mathcal{F}_t)_{t \ge 0} is a square
integrable martingale in the class \mathcal{H}^2 (M \in \mathcal{H}^2), i.e., a martingale with


\sup_{t \ge 0} E M_t^2 < \infty.    (6)


Let \langle M \rangle = (\langle M \rangle_t, \mathcal{F}_t)_{t \ge 0} be the quadratic characteristic of a martingale M in \mathcal{H}^2
(or in \mathcal{H}^2_{loc}), i.e., by definition, a predictable (see Definition 2 below) nondecreasing
process such that the process M^2 - \langle M \rangle = (M_t^2 - \langle M \rangle_t, \mathcal{F}_t)_{t \ge 0} is a martingale
(see § 5b and cf. Chapter II, § 1b).
The following results are well-known (see, e.g., [303; Chapter 5]).

A) If the process \langle M \rangle is absolutely continuous (P-a.s.) with respect to t, then
the set \mathcal{E} of simple functions is dense in the space L^2 of measurable functions
f = f(t, \omega) such that


E \int_0^\infty f^2(t, \omega)\,d\langle M \rangle_t < \infty.    (7)


In other words, for each f in this space there exists a sequence of simple functions
f_n = (f_n(t, \omega))_{t \ge 0}, \omega \in \Omega, n \ge 1, such that


E \int_0^\infty |f(t, \omega) - f_n(t, \omega)|^2\,d\langle M \rangle_t \to 0.    (8)


Note that if B = (B_t)_{t \ge 0} is a standard Brownian motion, then its quadratic
characteristic \langle B \rangle_t is identically equal to t for t \ge 0.

B) If the process \langle M \rangle is continuous (P-a.s.), then the set \mathcal{E} of simple functions
is dense in the space L^2 of measurable functions f = (f(t, \omega))_{t \ge 0} such that (7)
holds and the variables f(\tau(\omega), \omega) are \mathcal{F}_\tau-measurable for each (finite) Markov time
\tau = \tau(\omega).

C) In general, if we do not additionally ask for the regularity of the trajectories
of \langle M \rangle, then the set \mathcal{E} of simple functions is dense in the space L^2 of measurable
functions f = (f(t, \omega), \mathcal{F}_t)_{t \ge 0} satisfying (7) and predictable in the sense that we
explain next.

First, let X = (X_n, \mathcal{F}_n)_{n \ge 1} be a stochastic sequence defined on our stochastic
basis. Then, in accordance with the standard definition (see Chapter II, § 1b), by
the predictability of X we mean that the variables X_n are \mathcal{F}_{n-1}-measurable for all
n \ge 1.

In the continuous-time case, the following definition of the predictability (with
respect to the stochastic basis (\Omega, \mathcal{F}, (\mathcal{F}_t)_{t \ge 0}, P)) proved to be the most suitable
one.

Let \mathcal{P} be the smallest \sigma-algebra in the space R_+ \times \Omega such that if a (measurable)
function Y = (Y(t, \omega))_{t \ge 0, \omega \in \Omega} is \mathcal{F}_t-measurable for each t \ge 0 and its t-trajectories
(for each fixed \omega \in \Omega) are left-continuous, then the map (t, \omega) \mapsto Y(t, \omega) generated
by this function is \mathcal{P}-measurable.


DEFINITION 1. We call the \sigma-algebra \mathcal{P} in R_+ \times \Omega the \sigma-algebra of predictable
sets.


DEFINITION 2. We say that a stochastic process X = (X_t(\omega))_{t \ge 0} defined on a
stochastic basis (\Omega, \mathcal{F}, (\mathcal{F}_t)_{t \ge 0}, P) is predictable if the map (t, \omega) \mapsto X(t, \omega) (= X_t(\omega))
is \mathcal{P}-measurable.

5. The above results on the approximation of functions f enable one, by analogy
with the case of a Brownian motion, to define the stochastic integral


I_\infty(f) = \int_0^\infty f(s, \omega)\,dM_s    (9)


for each M \in \mathcal{H}^2 by means of an isometry.
Then we can define the integrals I_t(f) for t > 0 by the formula


I_t(f) = \int_0^\infty I(s \le t) f(s, \omega)\,dM_s.    (10)


It should be pointed out that if we impose no regularity conditions (as in
A) or B)) on \langle M \rangle, then the above discussion shows that for M \in \mathcal{H}^2 the
stochastic integrals I(f) = (I_t(f))_{t \ge 0} are well defined for each predictable bounded
function f.

The next step in the extension of the definition of the stochastic integral I_t(f)
(or (f \cdot X)_t; this notation stresses the role of the process X with respect to which
the integration is carried out) is to predictable locally bounded functions f and
locally square integrable martingales M (martingales in the class \mathcal{H}^2_{loc}; if \mathcal{X} is some
class of processes, then we say that a process Y = (Y_t(\omega))_{t \ge 0} is in the class \mathcal{X}_{loc}
if there exists a sequence (\tau_n)_{n \ge 1} of Markov times such that \tau_n \uparrow \infty and the
‘stopped’ processes Y^{\tau_n} = (Y_{t \wedge \tau_n})_{t \ge 0} are in \mathcal{X} for each n \ge 1; cf. the definition in
Chapter II, § 1c.)

If (\tau_n) is a localizing sequence (for a locally bounded function f and a martingale
M \in \mathcal{H}^2_{loc}), then, in accordance with the above construction of stochastic integrals
for bounded functions f and M \in \mathcal{H}^2, the integrals f \cdot M^{\tau_n} = ((f \cdot M^{\tau_n})_t)_{t \ge 0} are
well-defined. Moreover, it is easy to see that the integrals for distinct n \ge 1 are
compatible, i.e., (f \cdot M^{\tau_{n+1}})^{\tau_n} = f \cdot M^{\tau_n}.

Hence there exists a (unique, up to stochastic indistinguishability) process
f \cdot M = ((f \cdot M)_t)_{t \ge 0} such that


(f \cdot M)^{\tau_n} = f \cdot M^{\tau_n}


for each n \ge 1.

The process f \cdot M = ((f \cdot M)_t)_{t \ge 0} so defined belongs to the class \mathcal{H}^2_{loc} (see
[250; Chapter I, § 4d] for details) and is called the stochastic integral of f with
respect to M.


6. The final step in the construction of stochastic integrals f \cdot X for locally bounded
predictable functions f = f(t, \omega) with respect to semimartingales X = (X_t, \mathcal{F}_t)_{t \ge 0}
is based on the following observation concerning the structure of semimartingales.

By definition, a semimartingale X is a process representable as (1), where
A = (A_t, \mathcal{F}_t)_{t \ge 0} is some process of bounded variation, i.e., \int_0^t |dA_s(\omega)| < \infty for
t > 0 and \omega \in \Omega, and M = (M_t, \mathcal{F}_t) is a local martingale.

By an important result in the general theory of martingales, each local martingale M
has a (not unique, in general) decomposition


M_t = M_0 + M'_t + M''_t,    t \ge 0,    (11)


where M' = (M'_t, \mathcal{F}_t)_{t \ge 0} and M'' = (M''_t, \mathcal{F}_t)_{t \ge 0} are local martingales,
M'_0 = M''_0 = 0, and M'' has bounded variation (we write this as M'' \in \mathcal{V}), while
M' \in \mathcal{H}^2_{loc} (see [250; Chapter I, Proposition 4.17]).

Hence each semimartingale X = (X_t, \mathcal{F}_t)_{t \ge 0} can be represented as a sum


X_t = X_0 + A'_t + M'_t,    (12)


where A' = A + M'' and M' \in \mathcal{H}^2_{loc}.
For locally bounded functions f we have the well-defined Lebesgue-Stieltjes
integrals (for each \omega \in \Omega)


(f \cdot A')_t = \int_0^t f(s, \omega)\,dA'_s,    (13)


and if, in addition, f is predictable, then the stochastic integrals (f \cdot M')_t are also
well defined. Thus, we can define the integrals (f \cdot X)_t in a natural way, by setting


(f \cdot X)_t = (f \cdot A')_t + (f \cdot M')_t.    (14)


To show that this definition is consistent we must, of course, prove that such
an integral is independent (up to stochastic equivalence) of the particular
representation (12), i.e., if, in addition, X_t = X_0 + \bar{A}_t + \bar{M}_t with \bar{A} \in \mathcal{V} and \bar{M} \in \mathcal{H}^2_{loc},
then

f \cdot A' + f \cdot M' = f \cdot \bar{A} + f \cdot \bar{M}.    (15)


This is obvious for elementary functions f and, by linearity, holds also for simple
functions (f \in \mathcal{E}). If f is predictable and bounded, then it can be approximated
by simple functions f_n convergent to f pointwise. Using this fact and localization
we obtain the required property (15).


Remark 3. As regards the details of the proofs of the above results and several 
other constructions of stochastic integrals with respect to semimartingales, includ- 
ing vector-valued ones, see, e.g., [250], [248], or [303], and also Chapter VII, § 1a 
below. 


7. We now dwell on several properties of stochastic integrals of locally bounded
functions f with respect to semimartingales (see properties (4.33)-(4.37) in [250;
Chapter I]):

a) f \cdot X is a semimartingale;

b) the map f \mapsto f \cdot X is linear;

c) if X is a local martingale (X \in \mathcal{M}_{loc}), then f \cdot X is also a local martingale;

d) for X \in \mathcal{V} the stochastic integral coincides with the Stieltjes integral;

e) (f \cdot X)_0 = 0 and f \cdot X = f \cdot (X - X_0);

f) \Delta(f \cdot X) = f \Delta X, where \Delta X_t = X_t - X_{t-};

g) if g is a predictable locally bounded function, then g \cdot (f \cdot X) = (gf) \cdot X.

We point out also the following result (similar to Lebesgue’s theorem on
dominated convergence) concerning passages to the limit under the sign of stochastic
integral:

h) if g_n = g_n(t, \omega) are predictable processes convergent pointwise to g = g(t, \omega)
and if |g_n(t, \omega)| \le G(t, \omega), where G = (G(t, \omega))_{t \ge 0} is a locally bounded
predictable process, then g_n \cdot X converges to g \cdot X in measure uniformly on
each finite interval, i.e.,


\sup_{s \le t} |(g_n \cdot X)_s - (g \cdot X)_s| \xrightarrow{P} 0.    (16)
sst 


8. In connection with the above construction of stochastic integrals f \cdot X of
predictable functions f with respect to semimartingales X there naturally arises the
problem of their calculation using simpler procedures, e.g., based on the idea
of Riemann integration.

Here the case of left-continuous functions f = (f(t, \omega))_{t \ge 0} is of interest.

To state the corresponding result we give the following definition ([250;
Chapter I, § 4d]).

For each n \ge 1 let


T^{(n)} = \{\tau^{(n)}(m), m \ge 1\}

be a family of Markov times \tau^{(n)}(m) such that \tau^{(n)}(m) < \tau^{(n)}(m + 1) on the set
\{\tau^{(n)}(m) < \infty\}.


We shall call T^{(n)}, n \ge 1, Riemann sequences if

\sup_m [\tau^{(n)}(m + 1) \wedge t - \tau^{(n)}(m) \wedge t] \to 0    as n \to \infty    (17)


for all t \in R_+ and \omega \in \Omega.
We associate now with stochastic integrals (f \cdot X) their T^{(n)}-Riemann
approximants


T^{(n)}(f \cdot X)_t = \sum_m f(\tau^{(n)}(m), \omega) [X_{\tau^{(n)}(m+1) \wedge t} - X_{\tau^{(n)}(m) \wedge t}].    (18)


It turns out that if a process f = (f(t, \omega), \mathcal{F}_t)_{t \ge 0} has left-continuous trajectories,
then these T^{(n)}-Riemann approximants T^{(n)}(f \cdot X) converge to f \cdot X in measure
uniformly on each finite time interval [0, t], t > 0, i.e.,


\sup_{u \le t} |T^{(n)}(f \cdot X)_u - (f \cdot X)_u| \xrightarrow{P} 0.    (19)

The proof is fairly easy. Let

f^{(n)}(t, \omega) = \sum_m f(\tau^{(n)}(m), \omega) I\{(t, \omega): \tau^{(n)}(m) < t \le \tau^{(n)}(m + 1)\}.    (20)


Then the functions f^{(n)}(t, \omega) are predictable and converge to f pointwise since f
is left-continuous.

Let K_t = \sup_{s \le t} |f_s|. Clearly, the process K = (K_t, \mathcal{F}_t) is left-continuous, locally
bounded, and |f^{(n)}| \le K.
Hence

\sup_{u \le t} |(f^{(n)} \cdot X)_u - (f \cdot X)_u| \xrightarrow{P} 0

by property (h), and to prove the required result (19) it suffices to observe that
T^{(n)}(f \cdot X) = f^{(n)} \cdot X.
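
A numerical illustration of the convergence (19) in the simplest case f = X = B, a Brownian motion, can be sketched as follows. The assumptions here are not from the text: a deterministic uniform grid plays the role of the Riemann sequence, the Brownian path is simulated on that grid with a fixed seed, and the limit (B \cdot B)_t = (B_t^2 - t)/2 is the standard Itô value.

```python
import numpy as np

def left_point_sums(f_vals, x_vals):
    """Running sums  sum_m f(t_m) * (X_{t_{m+1}} - X_{t_m})  along a grid, as in (18)."""
    return np.concatenate([[0.0], np.cumsum(f_vals[:-1] * np.diff(x_vals))])

rng = np.random.default_rng(0)        # fixed seed, illustrative only
n = 2 ** 14                           # number of grid steps on [0, 1]
dB = rng.normal(0.0, np.sqrt(1.0 / n), size=n)
B = np.concatenate([[0.0], np.cumsum(dB)])

approx = left_point_sums(B, B)        # Riemann approximant of (B . B)_t on the grid

# Exact algebraic identity on the grid:
#   sum B_m (B_{m+1} - B_m) = (B_t^2 - sum (dB)^2) / 2.
realized_qv = np.concatenate([[0.0], np.cumsum(dB ** 2)])
identity_gap = np.max(np.abs(approx - 0.5 * (B ** 2 - realized_qv)))

# The sums of squared increments are close to t, so the approximant
# is uniformly close to the Ito value (B_t^2 - t) / 2.
t_grid = np.linspace(0.0, 1.0, n + 1)
ito_gap = np.max(np.abs(approx - 0.5 * (B ** 2 - t_grid)))
```

The left endpoint in each term matters: evaluating f at the right endpoint would shift the limit by the quadratic variation.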


§ 5b. Doob-Meyer Decomposition. Compensators. 
Quadratic Variation 


1. In the discrete-time case we have an important tool of ‘martingale’ analysis of
(arbitrary!) stochastic sequences H = (H_n, \mathcal{F}_n) with E|H_n| < \infty for n \ge 0: the
Doob decomposition

H_n = H_0 + A_n + M_n,    (1)


where A = (A_n, \mathcal{F}_{n-1}) is a predictable sequence, and M = (M_n, \mathcal{F}_n) is a martingale
(see formulas (1)-(5) in Chapter II, § 1b).
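
The discrete-time decomposition (1) can be computed explicitly in a toy setting. The sketch below assumes the standard construction A_n = \sum_{k \le n} E(H_k - H_{k-1} | \mathcal{F}_{k-1}) and a hypothetical model in which H_n is a function of i.i.d. steps equal to +1 or -1; the function name `doob_decomposition` and the chosen path are illustrative only.

```python
def doob_decomposition(h, path, p=0.5):
    """Doob decomposition of H_n = h(xi_1, ..., xi_n) along one +-1 path.

    The steps xi_k are assumed i.i.d., equal to +1 with probability p and
    to -1 otherwise, so E(H_k | F_{k-1}) is an average of h over the two
    possible next steps. Returns lists (A_n), (M_n) with H_n = H_0 + A_n + M_n.
    """
    a, m = [0.0], [0.0]
    for k in range(1, len(path) + 1):
        past = tuple(path[:k - 1])
        cond_mean = p * h(past + (1,)) + (1 - p) * h(past + (-1,))
        a.append(a[-1] + cond_mean - h(past))               # predictable increment
        m.append(m[-1] + h(tuple(path[:k])) - cond_mean)    # martingale increment
    return a, m

square_of_walk = lambda steps: float(sum(steps)) ** 2       # H_n = S_n^2
path = (1, 1, -1, 1, -1, -1, -1, 1)
A, M = doob_decomposition(square_of_walk, path)
```

For the symmetric walk the predictable part of S_n^2 is A_n = n, so that S_n^2 - n is the martingale part.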

In the same way, in the continuous-time case, we have the Doob-Meyer decompo- 
sition (of submartingales), which plays a similar role and, together with the concept 
of a stochastic integral, underlies the stochastic calculus of semimartingales. 

Let H = (H_t, \mathcal{F}_t)_{t \ge 0} be a submartingale, i.e., a stochastic process such that
the variables H_t are \mathcal{F}_t-measurable and integrable, t \ge 0, the trajectories of H
are càdlàg (right-continuous and having limits from the left), and (the
submartingale property)


E(H_t | \mathcal{F}_s) \ge H_s    (P-a.s.),    s \le t.    (2)


We shall say that an arbitrary stochastic process Y = (Y_t)_{t \ge 0} belongs to the
Dirichlet class (D) if


\sup_\tau E\{|Y_\tau| I(|Y_\tau| > c)\} \to 0    as c \to \infty,    (3)


where we take the supremum over all finite Markov times. In other words, the
family of random variables


\{Y_\tau : \tau is a finite Markov time\}


must be uniformly integrable.

THEOREM 1 (the Doob-Meyer decomposition). Each submartingale H in the class
(D) has a (unique) decomposition


H_t = H_0 + A_t + M_t,    (4)


where A = (A_t, \mathcal{F}_t) is an increasing predictable process with E A_t < \infty, t \ge 0,
A_0 = 0, and M = (M_t, \mathcal{F}_t) is a uniformly integrable martingale.


It is clear from (4) that each submartingale in the class (D) is a semimartingale
such that, in addition, the corresponding process A (belonging to the class \mathcal{V},
see §5a.6) is predictable. This justifies distinguishing a subclass of special
semimartingales X = (X_t, \mathcal{F}_t)_{t \ge 0} having a representation (4) with predictable process
A = (A_t, \mathcal{F}_t).

In the discrete-time case the Doob decomposition is unique (see Chapter II, § 1b).
Likewise, any two representations (4) with predictable processes of bounded
variation of a special semimartingale must coincide.

It is instructive to note that each special semimartingale is in fact a difference
of two local submartingales (or, equivalently, of two local supermartingales).


Remark 1. The Doob-Meyer decomposition is a ‘difficult’ result in martingale
theory. We do not present its proof here (see, e.g., [103], [248], or [303]); we outline
instead a possible approach to the proof based on the Doob decomposition for
discrete time.

Let X = (X_t, \mathcal{F}_t)_{t \ge 0} be a submartingale and let X^{(\Delta)} = (X_t^{(\Delta)}, \mathcal{F}_t^{(\Delta)})_{t \ge 0} be
its discrete \Delta-approximation with


X_t^{(\Delta)} = X_{[t/\Delta]\Delta}    and    \mathcal{F}_t^{(\Delta)} = \mathcal{F}_{[t/\Delta]\Delta}.


By the Doob decomposition for discrete time,


X_t^{(\Delta)} = X_0 + A_t^{(\Delta)} + M_t^{(\Delta)},

where

A_t^{(\Delta)} = A_{[t/\Delta]\Delta}^{(\Delta)} = \sum_{i=1}^{[t/\Delta]} E(X_{i\Delta}^{(\Delta)} - X_{(i-1)\Delta}^{(\Delta)} | \mathcal{F}_{(i-1)\Delta}^{(\Delta)})

          = \frac{1}{\Delta} \int_0^{[t/\Delta]\Delta} E(X_{[s/\Delta]\Delta + \Delta} - X_{[s/\Delta]\Delta} | \mathcal{F}_{[s/\Delta]\Delta})\,ds.


Hence one could probably seek the nondecreasing predictable process A =
(A_t, \mathcal{F}_t)_{t \ge 0} participating in the Doob-Meyer decomposition in the following form:


A_t = \lim_{\Delta \downarrow 0} \frac{1}{\Delta} \int_0^t E(X_{s+\Delta} - X_s | \mathcal{F}_s)\,ds.


Moreover, if this process is well-defined, then it remains only to prove that the
compensated process X - A = (X_t - A_t, \mathcal{F}_t)_{t \ge 0} is a uniformly integrable martingale.

It is now clear why one calls the process A = (A_t, \mathcal{F}_t)_{t \ge 0} in the Doob-Meyer
decomposition the compensator of the submartingale X.

2. The Doob-Meyer decomposition has several useful consequences (see [250;
Chapter I, §3b]). We point out two of them.


COROLLARY 1. Each predictable local martingale of bounded variation X =
(X_t, \mathcal{F}_t)_{t \ge 0} with X_0 = 0 is stochastically indistinguishable from zero.


COROLLARY 2. Let A = (A_t, \mathcal{F}_t)_{t \ge 0} be a process in the class \mathcal{A}_{loc}, i.e., locally
integrable. Then there exists a unique (up to stochastic indistinguishability)
predictable process \tilde{A} = (\tilde{A}_t, \mathcal{F}_t)_{t \ge 0} (the compensator of A) such that A - \tilde{A} is a
local martingale. If, in addition, A is a nondecreasing process, then \tilde{A} is also
nondecreasing.


3. Now, we consider the concepts of quadratic variation and quadratic covariance 
for semimartingales, which play important roles in stochastic calculus. (For exam- 
ple, these characteristics are explicitly involved in Itô’s formula presented below, 
in § 5c.) 

For the discrete-time case we introduced the quadratic variation of a martingale 
in Chapter II, §1b. This definition is extended to the case of continuous time as 
follows. 


DEFINITION 1. The quadratic covariance of two semimartingales, X and Y, is the
process [X, Y] = ([X, Y]_t, \mathcal{F}_t)_{t \ge 0} such that


[X, Y]_t = X_t Y_t - \int_0^t X_{s-}\,dY_s - \int_0^t Y_{s-}\,dX_s - X_0 Y_0.    (5)


(Note that the stochastic integrals in (5) are well defined because the left-continuous
processes (X_{t-}) and (Y_{t-}) are locally bounded.)


DEFINITION 2. The quadratic variation of a semimartingale X is the process
[X, X] = ([X, X]_t, \mathcal{F}_t)_{t \ge 0} such that


[X, X]_t = X_t^2 - 2 \int_0^t X_{s-}\,dX_s - X_0^2.    (6)


One also writes [X] in place of [X, X] (cf. formula (10) in Chapter II, § 1b).
We note that definitions (5) and (6) immediately yield the polarization
formula

[X, Y] = \frac{1}{4} ([X + Y, X + Y] - [X - Y, X - Y]).    (7)


The names ‘quadratic variation’ and ‘quadratic covariance’ given to [X, X] and
[X, Y] can be explained by the following arguments.
Let T^{(n)}, n \ge 1, be Riemann sequences (see §5a.8) and let


S_t^{(n)}(X, Y) = \sum_m (X_{\tau^{(n)}(m+1) \wedge t} - X_{\tau^{(n)}(m) \wedge t}) (Y_{\tau^{(n)}(m+1) \wedge t} - Y_{\tau^{(n)}(m) \wedge t}).    (8)


Then

\sup_{u \le t} |S_u^{(n)}(X, Y) - [X, Y]_u| \xrightarrow{P} 0    as n \to \infty    (9)

for each t > 0; in particular,

\sup_{u \le t} |S_u^{(n)}(X, X) - [X, X]_u| \xrightarrow{P} 0    as n \to \infty.    (10)


To prove (10) it suffices to observe that

S_u^{(n)}(X, X) = X_u^2 - X_0^2 - 2 T^{(n)}(X_- \cdot X)_u    (11)

(see (6)) and that, by property (19) in § 5a,


2 T^{(n)}(X_- \cdot X)_u \xrightarrow{P} 2 (X_- \cdot X)_u = 2 \int_0^u X_{s-}\,dX_s.


Now, relation (9) is a consequence of (10) and polarization formula (7).


4. By (10) we obtain that [X, X] is a nondecreasing process. Since it is càdlàg, it
follows that [X, X] \in \mathcal{V}^+, therefore (by (7)) the process [X, Y] also has bounded
variation ([X, Y] \in \mathcal{V}), provided that Y \in \mathcal{V}^+.

This, together with (5), proves the following result: a product of two
semimartingales is itself a semimartingale.

Rewriting (5) as


X_t Y_t = X_0 Y_0 + \int_0^t X_{s-}\,dY_s + \int_0^t Y_{s-}\,dX_s + [X, Y]_t,    (12)


where [X, Y] is the process defined by (7), we can regard this relation as the formula
of integration by parts for semimartingales. Written in the differential form it is
even more transparent:

d(XY) = X_-\,dY + Y_-\,dX + d[X, Y].    (13)

Clearly, this is essentially a semimartingale generalization of the classical
relation

\Delta(X_n Y_n) = X_{n-1} \Delta Y_n + Y_{n-1} \Delta X_n + \Delta X_n \Delta Y_n    (14)

for sequences X = (X_n) and Y = (Y_n).

A list of properties of the quadratic variation and the quadratic covariance under
various assumptions about X and Y can be found, e.g., in the book [250;
Chapter I, § 4e].

We point out only several properties, where we always assume that Y \in \mathcal{V}^+:

a) if one of the semimartingales X and Y is continuous, then [X, Y] = 0;

b) if X is a semimartingale and Y is a predictable process of bounded variation,
then

[X, Y] = \Delta Y \cdot X    and    XY = Y \cdot X + X_- \cdot Y;

c) if Y is a predictable process and X is a local martingale, then [X, Y] is a
local martingale;

d) \Delta[X, Y] = \Delta X \Delta Y.
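
Summing the classical relation (14) over n gives the discrete counterpart of the integration-by-parts formula (12), which can be checked numerically. The two sequences below are arbitrary simulated values chosen only for illustration; nothing about them comes from the text.

```python
import numpy as np

rng = np.random.default_rng(1)            # arbitrary illustrative sequences
X = np.cumsum(rng.normal(size=50))
Y = np.cumsum(rng.normal(size=50))
dX, dY = np.diff(X), np.diff(Y)

# Right-hand side of (14), summed over k = 1, ..., n.
int_X_dY = np.cumsum(X[:-1] * dY)         # sum of X_{k-1} * dY_k
int_Y_dX = np.cumsum(Y[:-1] * dX)         # sum of Y_{k-1} * dX_k
bracket = np.cumsum(dX * dY)              # discrete [X, Y]_n = sum of dX_k * dY_k

lhs = X[1:] * Y[1:] - X[0] * Y[0]         # X_n Y_n - X_0 Y_0
gap = np.max(np.abs(lhs - (int_X_dY + int_Y_dX + bracket)))
```

The identity is purely algebraic, so `gap` is zero up to rounding for any pair of sequences.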

5. Besides the processes [X, Y] and [X, X] just defined (which are often called
‘square brackets’) an important position in stochastic calculus is occupied by the
processes \langle X, Y \rangle and \langle X, X \rangle (‘angle brackets’) that we define next.

Let M = (M_t, \mathcal{F}_t)_{t \ge 0} be a square integrable martingale in the class \mathcal{H}^2. Then,
by Doob’s inequality (see, e.g., [303], and also (36) in § 3b for the case of a Brownian
motion),

E \sup_t M_t^2 \le 4 \sup_t E M_t^2 < \infty.    (15)


Hence the process M^2, which is a submartingale (by Jensen’s inequality), belongs
to the class (D). In view of the Doob-Meyer decomposition, there exists a
nondecreasing predictable integrable process, which we denote by \langle M, M \rangle or \langle M \rangle, such
that the difference M^2 - \langle M, M \rangle is a square integrable martingale.

If M \in \mathcal{H}^2_{loc}, then carrying out a suitable localization we can verify that there
exists a nondecreasing predictable process (which we denote again by \langle M, M \rangle or \langle M \rangle)
such that M^2 - \langle M \rangle is a locally square integrable martingale.

For two martingales, M and N, in \mathcal{H}^2_{loc} their ‘angle bracket’ \langle M, N \rangle is defined
by the formula


\langle M, N \rangle = \frac{1}{4} (\langle M + N, M + N \rangle - \langle M - N, M - N \rangle).    (16)


It is straightforward that \langle M, N \rangle is a predictable process of bounded variation
and MN - \langle M, N \rangle is a local martingale.

It follows from (5) that MN - [M, N] is also a local martingale. Hence if
the martingales M and N are in the class \mathcal{H}^2_{loc}, then [M, N] - \langle M, N \rangle is a local
martingale.

In accordance with Corollary 2 in subsection 2, one calls the predictable process
\langle M, N \rangle the compensator of [M, N] and one often writes


\langle M, N \rangle = \widetilde{[M, N]}.    (17)

Remark 2. In the discrete-time case, given two square integrable martingales,
M = (M_n, \mathcal{F}_n) and N = (N_n, \mathcal{F}_n), the corresponding sequences [M, N] and
\langle M, N \rangle are defined as follows:


[M, N]_n = \sum_{k=1}^n \Delta M_k \Delta N_k    (18)

and

\langle M, N \rangle_n = \sum_{k=1}^n E(\Delta M_k \Delta N_k | \mathcal{F}_{k-1}),    (19)


where \Delta M_k = M_k - M_{k-1} and \Delta N_k = N_k - N_{k-1} (cf. Chapter II, § 1b).




Relation (19) suggests the following (formal, but giving one a clear idea)
representation for the quadratic covariance in the continuous-time case:


\langle M, N \rangle_t = \int_0^t E(dM_s\,dN_s | \mathcal{F}_s).    (20)


(Cf. the formula for A_t in Remark 1 at the end of subsection 1.)


6. We now discuss the definition of ‘angle brackets’ for semimartingales and their 
connection with ‘square brackets’. 

Let X = (X_t, \mathcal{F}_t)_{t \ge 0} be a semimartingale with decomposition X = X_0 + M + A,
where M is a local martingale (M \in \mathcal{M}_{loc}) and A is a process of bounded variation
(A \in \mathcal{V}).

Besides the already used representation of the local martingale M as the sum
M = M_0 + M' + M'' with M'' \in \mathcal{V} \cap \mathcal{M}_{loc} and M' \in \mathcal{H}^2_{loc} \cap \mathcal{M}_{loc} (see formula (11)
in §5a), each local martingale M can be represented (moreover, in a unique way)
as a sum


M = M_0 + M^c + M^d,    (21)


where M^c = (M^c_t, \mathcal{F}_t)_{t \ge 0} is a continuous local martingale and M^d = (M^d_t, \mathcal{F}_t)_{t \ge 0}
is a purely discontinuous local martingale. (We say that a local martingale X is
purely discontinuous if X_0 = 0 and it is orthogonal to each continuous martingale Y,
i.e., XY is a local martingale; see [250; Chapter I, § 4b] for details.)

Hence each semimartingale X can be represented as the sum


X = X_0 + M^c + M^d + A.


Remarkably, the continuous martingale component M^c of X is unambiguously
defined (this is a consequence of the Doob-Meyer decomposition; see [250;
Chapter I, 4.18 and 4.27] for a discussion), which explains why it is commonly denoted
by X^c.

Clearly, [X^c, X^c] = \langle X^c, X^c \rangle; moreover, one can prove ([250; Chapter I, 4.52])
that

[X, X]_t = \langle X^c, X^c \rangle_t + \sum_{s \le t} (\Delta X_s)^2    (22)


and, for two semimartingales, X and Y,


[X, Y]_t = \langle X, Y \rangle_t + \sum_{s \le t} \Delta X_s \Delta Y_s,    (23)

where we set by definition \langle X, Y \rangle_t = \langle X^c, Y^c \rangle_t.
Remark 3. For local martingales M we have


\sum_{s \le t} (\Delta M_s)^2 < \infty    (P-a.s.)    (24)
for each t > 0 and

[M, M]^{1/2} \in \mathcal{A}_{loc}    (25)

(see, e.g., [250; Chapter I, §4]). Since processes A of bounded variation satisfy the


inequality \sum_{s \le t} |\Delta A_s| < \infty (P-a.s.) for each t > 0, it follows that


\sum_{s \le t} (\Delta A_s)^2 < \infty    (P-a.s.)    (26)


for the corresponding component of an arbitrary semimartingale X = X_0 + M + A.
Hence


\sum_{s \le t} (\Delta X_s)^2 < \infty    (P-a.s.)


for each t > 0, and the right-hand sides of (22) and (23) are well defined and finite
(P-a.s.).


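§5c. Itô’s Formula for Semimartingales. Generalizations


1. THEOREM (Itô’s formula). Let

X = (X^1, ..., X^d)

be a d-dimensional semimartingale and let F = F(x_1, ..., x_d) be a C^2-function
in R^d.
Then the process F(X) is also a semimartingale and


F(X_t) = F(X_0) + \sum_{i \le d} \bigl( D_i F(X_-) \cdot X^i \bigr)_t

         + \frac{1}{2} \sum_{i,j \le d} \bigl( D_{ij} F(X_-) \cdot \langle X^{i,c}, X^{j,c} \rangle \bigr)_t

         + \sum_{s \le t} \Bigl[ F(X_s) - F(X_{s-}) - \sum_{i \le d} D_i F(X_{s-}) \Delta X_s^i \Bigr],    (1)

where D_i F = \partial F / \partial x_i and D_{ij} F = \partial^2 F / \partial x_i \partial x_j.


The proof can be found in many textbooks on stochastic calculus (e.g., [103],
[248], or [250]).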
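
As a quick consistency check of (1), one can take d = 1 and F(x) = x^2, so that D_1 F(x) = 2x and D_{11} F(x) = 2. The following worked computation is a sketch for this special case only:

```latex
\begin{align*}
X_t^2 &= X_0^2 + 2\int_0^t X_{s-}\,dX_s + \langle X^c, X^c\rangle_t
        + \sum_{s\le t}\bigl[X_s^2 - X_{s-}^2 - 2X_{s-}\,\Delta X_s\bigr] \\
      &= X_0^2 + 2\int_0^t X_{s-}\,dX_s + \langle X^c, X^c\rangle_t
        + \sum_{s\le t}(\Delta X_s)^2 ,
\end{align*}
```

since X_s^2 - X_{s-}^2 - 2X_{s-}\Delta X_s = (\Delta X_s)^2. Comparing with definition (6) in § 5b gives [X, X]_t = \langle X^c, X^c \rangle_t + \sum_{s \le t} (\Delta X_s)^2, in agreement with formula (22) of § 5b.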

2. We consider now two examples demonstrating the efficiency of Itô’s formula in
various issues of stochastic calculus.


EXAMPLE 1 (Doléans’s equation and stochastic exponential). Let X = (X_t, \mathcal{F}_t)_{t \ge 0}
be a fixed semimartingale. We consider the problem of finding, in the class of càdlàg
processes (processes with right-continuous trajectories having limits from the left),
a solution Y = (Y_t, \mathcal{F}_t)_{t \ge 0} to Doléans’s equation


Y_t = 1 + \int_0^t Y_{s-}\,dX_s,    (2)


or, in the differential form,


dY = Y_-\,dX,    Y_0 = 1.    (3)


By Itô’s formula for the two processes

X_t^1 = X_t - X_0 - \frac{1}{2} \langle X^c, X^c \rangle_t


and


X_t^2 = \prod_{0 < s \le t} (1 + \Delta X_s) e^{-\Delta X_s},    X_0^2 = 1,    (4)


and the function F(x_1, x_2) = e^{x_1} \cdot x_2, we obtain that the process

\mathcal{E}(X)_t = F(X_t^1, X_t^2),    t \ge 0,


i.e., the process


\mathcal{E}(X) = (\mathcal{E}(X)_t, \mathcal{F}_t)_{t \ge 0}    (5)

such that

\mathcal{E}(X)_t = e^{X_t - X_0 - \frac{1}{2} \langle X^c, X^c \rangle_t} \prod_{0 < s \le t} (1 + \Delta X_s) e^{-\Delta X_s},    (6)


1) is a semimartingale;

2) satisfies Doléans’s stochastic equation (2).
Moreover, the process \mathcal{E}(X) (the Doléans stochastic exponential) is a unique (up
to stochastic indistinguishability) solution of (2) in the class of processes with càdlàg
trajectories. (See the proof and a generalization to the case of complex-valued
semimartingales in [250; Chapter I, § 4f].)
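
Formula (6) can be compared with a direct solution of equation (2) in the simplest case of a piecewise-constant process X with finitely many jumps. In this hypothetical setting (not from the text) the continuous martingale part vanishes, X_t - X_0 is the sum of the jumps up to t, and equation (2) reduces to the recursion Y_s = Y_{s-} + Y_{s-} \Delta X_s. The jump sizes below are illustrative.

```python
import math

jumps = [0.10, -0.25, 0.05, -1.5, 0.30]   # hypothetical jump sizes of X, in time order

def stochastic_exponential(jumps):
    """Formula (6) for a piecewise-constant X: <X^c, X^c> = 0 and
    X_t - X_0 is the running sum of the jumps."""
    values, total, prod = [1.0], 0.0, 1.0
    for dx in jumps:
        total += dx
        prod *= (1.0 + dx) * math.exp(-dx)
        values.append(math.exp(total) * prod)
    return values

def solve_doleans(jumps):
    """Equation (2) solved recursively: at each jump Y moves by Y_{s-} * dX_s."""
    values = [1.0]
    for dx in jumps:
        values.append(values[-1] + values[-1] * dx)
    return values

E = stochastic_exponential(jumps)
Y = solve_doleans(jumps)
```

Both lists coincide up to rounding. The jump of size -1.5 makes the value negative, which shows that, unlike the ordinary exponential, a stochastic exponential can change sign when X has jumps below -1.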

EXAMPLE 2 (Lévy’s theorem; §3b). Let X = (X_t, \mathcal{F}_t)_{t \ge 0} be a continuous local
martingale with \langle X \rangle_t = t. Then X is a Brownian motion.


For the proof we consider the function F(x) = e^{i\lambda x} and use Itô’s formula
(separately for the real and the imaginary parts). Then we obtain


e^{i\lambda X_t} = 1 + i\lambda \int_0^t e^{i\lambda X_s}\,dX_s - \frac{\lambda^2}{2} \int_0^t e^{i\lambda X_s}\,ds.    (7)


The integral \int_0^t e^{i\lambda X_s}\,dX_s is a local martingale. We set \tau_n = \inf\{t: |X_t| \ge n\}.
Let E(\,\cdot\,; A) be averaging over the set A. If A \in \mathcal{F}_0, then we see from (7) that


E(e^{i\lambda X_{t \wedge \tau_n}}; A) = P(A) - \frac{\lambda^2}{2} E\Bigl( \int_0^{t \wedge \tau_n} e^{i\lambda X_s}\,ds; A \Bigr).    (8)


Hence, as n \to \infty, we obtain the following relation for f_A(t) = E(e^{i\lambda X_t}; A),
t \ge 0:


f_A(t) = P(A) - \frac{\lambda^2}{2} \int_0^t f_A(s)\,ds,    t \ge 0;    (9)

moreover, it is clear that |f_A(t)| \le 1 and f_A(0) = P(A).
The only solution of (9) satisfying these conditions is as follows:

f_A(t) = e^{-\frac{\lambda^2}{2} t} P(A).    (10)

Hence

E(e^{i\lambda X_t} | \mathcal{F}_0) = e^{-\frac{\lambda^2}{2} t}.


In a similar way we can show that for all s and t, s < t, we have

E(e^{i\lambda (X_t - X_s)} | \mathcal{F}_s) = e^{-\frac{\lambda^2}{2} (t - s)}.    (11)


Consequently, X = (X_t, \mathcal{F}_t)_{t \ge 0} is a process with independent Gaussian
increments, E(X_t - X_s) = 0, and E(X_t - X_s)^2 = t - s. Hence, by our assumption of the
continuity of the trajectories, X is a Brownian motion (see the definition in § 1b).


3. Itô’s formula, which is an important tool of stochastic calculus, can be
generalized in several directions: to nonsmooth functions F = F(x_1, ..., x_d) and
semimartingales X = (X^1, ..., X^d), to smooth functions F = F(x_1, ..., x_d) and
non-semimartingales X = (X^1, ..., X^d), and so on.


Following [166] and using some ideas from this paper we present here several
results in this direction.
a) Let $d = 1$ and let $X$ be a standard Brownian motion $B = (B_t)_{t \ge 0}$. Let
$F = F(x)$ be an absolutely continuous function, i.e.,

$$F(x) = F(0) + \int_0^x f(y)\,dy; \tag{12}$$

here we assume that the (measurable) function $f = f(y)$ belongs to the class
$L^2_{\mathrm{loc}}(\mathbb{R}^1)$, i.e.,

$$\int_{|x| \le K} f^2(x)\,dx < \infty \tag{13}$$

for each $K > 0$.

Note that neither $f(B) = (f(B_t))_{t \ge 0}$ nor $F(B) = (F(B_t))_{t \ge 0}$ is a semimartingale
in general; therefore we cannot use Itô’s formula in the form (1).

However, it is proved in [166] that we have the following relation:

$$F(B_t) = F(0) + \int_0^t f(B_s)\,dB_s + \frac{1}{2}\,[f(B), B]_t, \tag{14}$$

where $[f(B), B]$ is the quadratic covariance of the processes $f(B)$ and $B$ defined as
follows:

$$[f(B), B]_t = \mathsf{P}\text{-}\lim_{n} \sum_m \bigl(f(B_{t^{(n)}(m+1) \wedge t}) - f(B_{t^{(n)}(m) \wedge t})\bigr)\bigl(B_{t^{(n)}(m+1) \wedge t} - B_{t^{(n)}(m) \wedge t}\bigr); \tag{15}$$

here the $T^{(n)} = \{t^{(n)}(m),\ m \ge 1\}$, $n \ge 1$, are Riemann sequences of (deterministic)
times $t^{(n)}(m)$ defined in § 5a.8.

We emphasize that the process $f(B)$ is not a semimartingale in general; therefore,
the mere existence of the limit (in probability $\mathsf{P}$) in (15) is nontrivial. One of
the results of [166] is the proof of the existence of this limit.


COROLLARY 1. Let $F \in C^2$. Then

$$[f(B), B]_t = \int_0^t f'(B_s)\,ds$$

and (14) coincides with Itô’s formula.


COROLLARY 2. Let $F(x) = |x|$. Then $[f(B), B]_t = 2L_t(0)$, where

$$L_t(0) = \lim_{\varepsilon \downarrow 0} \frac{1}{2\varepsilon} \int_0^t I(|B_s| \le \varepsilon)\,ds \tag{16}$$

is the local time that the Brownian motion “spends” at zero over the period $(0, t]$.
Hence, from (14) we obtain Tanaka’s formula

$$|B_t| = \int_0^t \operatorname{Sgn} B_s\,dB_s + L_t(0) \tag{17}$$


(cf. the example in Chapter II, § 1b and § 3e in this chapter). 


Note that the process $|B| = (|B_t|, \mathcal{F}_t)_{t \ge 0}$ is a submartingale, the stochastic
integral is a martingale, and $L(0) = (L_t(0), \mathcal{F}_t)$ is a continuous (and therefore,
predictable) nondecreasing process.

Hence we can regard (17) as an explicit formula for the Doob-Meyer decomposition
of the submartingale $|B|$.

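b) Again, let $d = 1$, and let $X$ be a fractional Brownian motion $B^H = (B_t^H)_{t \ge 0}$
with Hurst parameter $H \in (\frac{1}{2}, 1]$; see § 2c. As before, let $T^{(n)} = \{t^{(n)}(m),\ m \ge 1\}$,
$n \ge 1$, be Riemann sequences of (deterministic) times $t^{(n)}(m)$.

Since $\mathsf{E}|\Delta B_t^H|^2 = |\Delta t|^{2H}$ (see (6) in § 2c), it follows that

$$\lim_n \mathsf{E} \sum_m \bigl|B^H_{t^{(n)}(m+1) \wedge t} - B^H_{t^{(n)}(m) \wedge t}\bigr|^2 = 0,$$

and therefore, for $H \in (\frac{1}{2}, 1]$ we have

$$\mathsf{P}\text{-}\lim_n \sum_m \bigl|B^H_{t^{(n)}(m+1) \wedge t} - B^H_{t^{(n)}(m) \wedge t}\bigr|^2 = 0 \tag{18}$$

for the limit in probability.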


Remark. If $H = \frac{1}{2}$, then the corresponding limit is equal to $t$, whereas it is $+\infty$ for
$H \in (0, \frac{1}{2})$. One usually calls processes with property (18) zero-energy processes
(see Chapter IV, § 3a.6 and also, e.g., [166]).
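The three regimes in this remark can be checked numerically at the level of expectations. The following is a minimal sketch, assuming a uniform partition of $[0, t]$ into $n$ steps (the text allows general Riemann sequences) and using only the identity $\mathsf{E}|\Delta B_t^H|^2 = |\Delta t|^{2H}$ quoted above; the function name `expected_energy` is illustrative and does not come from the source.

```python
def expected_energy(hurst: float, t: float, n: int) -> float:
    """Expected sum of squared increments of B^H over a uniform n-step partition of [0, t].

    Each increment over a step of length t/n has second moment (t/n)**(2H),
    so the expected sum is n * (t/n)**(2H) = t**(2H) * n**(1 - 2H).
    """
    step = t / n
    return n * step ** (2.0 * hurst)


if __name__ == "__main__":
    # H < 1/2: grows without bound; H = 1/2: stays equal to t; H > 1/2: tends to zero.
    for hurst in (0.25, 0.5, 0.75):
        values = [expected_energy(hurst, 1.0, n) for n in (10, 100, 1000, 10000)]
        print(hurst, [round(v, 4) for v in values])
```

The expectation only illustrates the first limit displayed before (18); convergence in probability then follows because the sums are nonnegative.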

Let $F = F(x)$ be a $C^2$-function and let $f(x) = F'(x)$.

For a Brownian motion $B = (B_t)_{t \ge 0}$ we have

$$dF(B_t) = F'(B_t)\,dB_t + \frac{1}{2} F''(B_t)\,(dB_t)^2, \tag{19}$$

which (on having agreed that $(dB_t)^2 = dt$; see § 3d for details) gives one the
following Itô’s formula written in the differential form:

$$dF(B_t) = F'(B_t)\,dB_t + \frac{1}{2} F''(B_t)\,dt. \tag{20}$$


Property (18) holds for a fractional Brownian motion $B^H = (B_t^H)_{t \ge 0}$ with
parameter $\frac{1}{2} < H < 1$, which makes the following agreement look fairly natural:

$$(dB_t^H)^2 = 0. \tag{21}$$

(Cf. (12)-(14) in § 3d.)

If we replace $B$ by $B^H$ in the expansion (19), then, in view of (21), we arrive
(only formally at this stage) at the representation (recall that $f = F'$)

$$dF(B_t^H) = f(B_t^H)\,dB_t^H, \tag{22}$$

which, in accordance with the standard conventions of stochastic calculus, must be
interpreted in the integral form: for $s < t$ we have

$$F(B_t^H) - F(B_s^H) = \int_s^t f(B_u^H)\,dB_u^H \quad (\mathsf{P}\text{-a.s.}). \tag{23}$$

We shall now prove this formula, restricting ourselves for simplicity to the case
of $F \in C^2$ and explaining also on the way how one should interpret the ‘stochastic
integral’ in (23).

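By Taylor’s formula with remainder in the integral form,

$$F(x) = F(y) + f(y)(x - y) + \int_y^x f'(u)(x - u)\,du.$$

Hence, proceeding as in [166] and [299], for each $T^{(n)}$-partitioning, $n \ge 1$, we obtain

$$F(B_t^H) - F(B_0^H) = \sum_m \bigl[F(B^H_{t^{(n)}(m+1) \wedge t}) - F(B^H_{t^{(n)}(m) \wedge t})\bigr]$$

$$= \sum_m f(B^H_{t^{(n)}(m) \wedge t})\bigl(B^H_{t^{(n)}(m+1) \wedge t} - B^H_{t^{(n)}(m) \wedge t}\bigr) + R_t^{(n)}, \tag{24}$$

where

$$R_t^{(n)} = \sum_m \int_{B^H_{t^{(n)}(m) \wedge t}}^{B^H_{t^{(n)}(m+1) \wedge t}} f'(u)\bigl(B^H_{t^{(n)}(m+1) \wedge t} - u\bigr)\,du.$$

Clearly, $\mathsf{P}\bigl(\sup_{0 \le u \le t} |f'(B_u^H)| < \infty\bigr) = 1$ and, in view of (18),

$$|R_t^{(n)}| \le \frac{1}{2} \sup_{0 \le u \le t} |f'(B_u^H)| \sum_m \bigl|B^H_{t^{(n)}(m+1) \wedge t} - B^H_{t^{(n)}(m) \wedge t}\bigr|^2 \xrightarrow{\ \mathsf{P}\ } 0. \tag{25}$$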
The left-hand side of (24) is independent of $n$ and $R_t^{(n)} \xrightarrow{\ \mathsf{P}\ } 0$; therefore there
exists

$$\lim_n \sum_m f(B^H_{t^{(n)}(m) \wedge t})\bigl(B^H_{t^{(n)}(m+1) \wedge t} - B^H_{t^{(n)}(m) \wedge t}\bigr) \tag{26}$$

(in the sense of convergence in measure), which we denote by

$$\int_0^t f(B_s^H)\,dB_s^H \tag{27}$$

and call the stochastic integral with respect to the fractional Brownian motion
$B^H = (B_t^H)_{t \ge 0}$ ($H \in (\frac{1}{2}, 1]$, $f \in C^1$).

Simultaneously, these arguments also prove the required formula (23), which can be
regarded as an analogue of Itô’s formula for a fractional Brownian motion.
EXAMPLE 3. If $F(x) = x^2$, then (23) shows that for a fractional Brownian motion
$B^H$ with $\frac{1}{2} < H < 1$ we have

$$d(B_t^H)^2 = 2 B_t^H\,dB_t^H. \tag{28}$$

Recall that for a Brownian motion $B = B^{1/2}$,

$$d(B_t)^2 = 2 B_t\,dB_t + dt. \tag{29}$$

EXAMPLE 4. If $F(x) = e^x$, then

$$d\bigl(e^{B_t^H}\bigr) = e^{B_t^H}\,dB_t^H. \tag{30}$$

For a Brownian motion $B = B^{1/2}$ we have

$$d\bigl(e^{B_t}\bigr) = e^{B_t}\,dB_t + \frac{1}{2} e^{B_t}\,dt. \tag{31}$$
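Our considerations could be extended also to the case of several fractional
Brownian motions. For instance, assume that

$$(X_0, X_1, \dots, X_d) = (B^{H_0}, B^{H_1}, \dots, B^{H_d}),$$

where $H_0 = \frac{1}{2}$ (so that $B^{H_0}$ is a standard Brownian motion) and $\frac{1}{2} < H_1 < H_2 <
\cdots < H_d < 1$. If $F = F(t, x_0, x_1, \dots, x_d) \in C^{1,2}$, then (cf. (11) in § 3d)

$$dF = \frac{\partial F}{\partial t}\,dt + \frac{\partial F}{\partial x_0}\,dX_0 + \frac{1}{2} \frac{\partial^2 F}{\partial x_0^2}\,(dX_0)^2 + \sum_{i=1}^{d} \frac{\partial F}{\partial x_i}\,dX_i. \tag{32}$$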
EXAMPLE 5. We consider the process

$$S_t = e^{\left(\mu - \frac{\sigma_0^2}{2}\right)t + \sigma_0 B_t + \sigma_1 B_t^H}, \tag{33}$$

where $B = B^{1/2}$ is a standard Brownian motion and $B^H$ is a fractional Brownian
motion with $\frac{1}{2} < H < 1$. Then $S = (S_t)_{t \ge 0}$ can be regarded as the solution of the
stochastic differential equation

$$dS_t = S_t\,(\mu\,dt + \sigma_0\,dB_t + \sigma_1\,dB_t^H) \tag{34}$$

with $S_0 = 1$. (Cf. (2) and (4) in § 4b.)


Chapter IV. Statistical Analysis
of Financial Data


1. Empirical data. Probabilistic and statistical models
   of their description. Statistics of ‘ticks’
   § 1a. Structural changes in financial data gathering and analysis
   § 1b. Geography-related features of the statistical data on exchange rates
   § 1c. Description of financial indexes as stochastic processes
         with discrete intervention of chance
   § 1d. On the statistics of ‘ticks’

2. Statistics of one-dimensional distributions
   § 2a. Statistical data discretizing
   § 2b. One-dimensional distributions of the logarithms of relative
         price changes. Deviation from the Gaussian property
         and leptokurtosis of empirical densities
   § 2c. One-dimensional distributions of the logarithms of relative
         price changes. ‘Heavy tails’ and their statistics
   § 2d. One-dimensional distributions of the logarithms of relative
         price changes. Structure of the central parts of distributions

3. Statistics of volatility, correlation dependence,
   and aftereffect in prices
   § 3a. Volatility. Definition and examples
   § 3b. Periodicity and fractal structure of volatility in exchange rates
   § 3c. Correlation properties
   § 3d. ‘Devolatization’. Operational time
   § 3e. ‘Cluster’ phenomenon and aftereffect in prices

4. Statistical R/S-analysis
   § 4a. Sources and methods of R/S-analysis
   § 4b. R/S-analysis of some financial time series


1. Empirical Data. Probabilistic and Statistical Models 
of Their Description. Statistics of ‘Ticks’ 


§ 1a. Structural Changes in Financial Data Gathering and Analysis


1. Looking back to the changes in the empirical analysis of time-related financial 
data, one discovers the following features. 

In the 1970s and earlier, one mostly operated with data averaged over large 
time intervals: a year, a quarter, a month, a week. Among the most widespread 
probabilistic and statistical models (developed for the description of the behavior of 
the logarithms of financial indices) at that time (and nowadays) there were models 
of random walk type (see Chapter I, § 2a), autoregressive, and moving average 
models, their combinations, and so on (see Chapter II, § 1d). It should be noted 
that most of these models were linear. 

In the 1980s, in connection with the analysis of daily data, one saw the need 
to invoke nonlinear models; the ARCH and GARCH models and their various 
modifications (see Chapter II, §1d) are the best known examples of these. 

The analysis of intraday data became feasible in the 1990s. This is primarily
a result of the general progress in computer technology and telecommunications
accompanied by a sharp improvement (as compared with the ‘paper-based’ technology
of data storage and processing) in the methods of the collection, registration,
storage, and analysis of statistical information, arriving in a virtually incessant
flow.


2. Besides routine news coming even through daily newspapers or TV (for ex- 
ample, the exchange rates, several indexes, the opening and the closing trading 
prices of major stocks and commodities, and so on), several information agencies 
(Reuters, Telerate, Knight Ridder, Bloomberg, to mention just a few) deliver on-line 
megabytes of most diversified financial information to their customers: for instance, 
one can learn about the recent bid and ask currency prices as well as the name and 
the location of the bidder. 
Below is an example of a message from Reuters concerning the exchange rates
(against the US dollar) that is displayed on its customers’ monitors immediately
after 7 h 27 GMT (see [204]):

0727 DEM RABO RABOBANK UTR 1.6290/00 DEM 1.6365 1.6270

0727 FRF BUEX UECIC    PAR 5.5620/30 FRF 5.5835 5.5588

0726 NLG RABO RABOBANK UTR 1.8233/38 NLG 1.8309 1.8220

Here 0727 and 0726 are the times of quotation announcements; DEM, FRF, and
NLG are the abbreviations for particular currencies (the German mark, the French
franc, and the Dutch florin); RABO and BUEX are RABOBANK in Utrecht (UTR)
and UECIC bank in Paris (PAR), respectively; 1.6290 is the bid price; 00 following
this bid price indicates the ask price, 1.6300; 1.6365 and 1.6270 are the high and the
low prices of the last 24 hours. On the third line, 0726 means that RABOBANK
announced (or maintained) its florin quotations (1.8233/38) at 7 h 26 and nobody
(including RABOBANK itself) has announced new quotations since.
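The bid/ask convention just described can be sketched in code. This is a minimal illustration, not the Reuters format specification: the function name `parse_quote` is hypothetical, and the rollover rule (when the replaced digits do not exceed the bid, one unit of the next digit is added) is an assumption inferred from the single example 1.6290/00 giving the ask price 1.6300.

```python
from __future__ import annotations

from decimal import Decimal


def parse_quote(quote: str) -> tuple[Decimal, Decimal]:
    """Split a 'bid/suffix' quotation such as '1.6290/00' into (bid, ask).

    The suffix replaces the last digits of the bid. If the result is not
    above the bid, the digits are assumed to have rolled over and one unit
    of the digit just left of the suffix is added (assumption, see lead-in).
    """
    bid_text, suffix = quote.split("/")
    bid = Decimal(bid_text)
    ask = Decimal(bid_text[: -len(suffix)] + suffix)
    if ask <= bid:
        decimals = len(bid_text.split(".")[1])
        ask += Decimal(1).scaleb(len(suffix) - decimals)
    return bid, ask


if __name__ == "__main__":
    for q in ("1.6290/00", "5.5620/30", "1.8233/38"):
        print(q, parse_quote(q))
```

Decimal arithmetic is used so that the quoted digits are kept exactly, without binary rounding.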

Choosing $t_0$ = 7 h 27 as time zero we shall see that, say, the ask price of the
dollar against the German mark

$$S_t^a = \Bigl(\frac{\mathrm{DEM}}{\mathrm{USD}}\Bigr)_t^a, \quad t \ge t_0,$$

behaves as in Fig. 28. (Cf. Fig. 6 in Chapter I, § 1b.)

FIGURE 28. Behavior of the cross rate $S_t^a = (\mathrm{DEM}/\mathrm{USD})_t^a$, $t \ge t_0$


That is, the price $S_t^a$ keeps its value for some time (on the interval $[0, \tau_1)$); then,
at the instant $\tau_1$, it drops (one speaks about a ‘tick’ indicating that some bank
announced another quotation $S^a_{\tau_1}$ at time $\tau_1$), and so on.

This raises two questions:

(I) what can be said about the statistics of the lengths of the ‘intertick’ intervals
$(\tau_{k+1} - \tau_k)$;

(II) what can be said about the statistics of the changes in price (absolute changes
$S^a_{\tau_{k+1}} - S^a_{\tau_k}$ or relative changes $S^a_{\tau_{k+1}} / S^a_{\tau_k}$)?
Extracting this information from the available data is the primary task of the
statistical analysis of the exchange rates or data on other financial indexes, which
often exhibit patterns of dynamics similar to the above one.

Clearly, the function of such an analysis is the construction of probabilistic and
statistical models of such processes as ask prices $(S_t^a)$ or bid prices $(S_t^b)$. This is, in
the long run, important also for understanding the evolution of financial indexes,
the pricing mechanisms, and for predictions of the price development in the future.

It should be noted that registering and processing statistical data, storing it in 
a form allowing an easy recovery is a difficult task, impossible without advanced 
technology. Still, it is equally clear that an access to the results of statistical analy- 
sis packed in a ready-to-use form gives one indisputable advantages in the securities 
markets as regards building an ‘efficient portfolio’, making rational investment de- 
cisions, the assessment of investment projects, securities, and so on. 


3. The almost incessant registration and processing of statistical data, which have 
been made possible by the modern technology, reveals a high-frequency pattern 
of the behavior of financial indexes, which evolve in a fairly chaotic manner. This 
pattern is not discernible after discretization (with respect to time or the phase vari- 
ables). Consequently, the appearance in financial mathematics of problems dealing 
with ‘high-frequency’ is the result of the new opportunities offered by almost contin- 
uous gathering of statistical data. It is this advanced technology of data gathering 
that enabled one to discover, besides the presence of high frequency components, 
several other peculiarities of the dynamics of financial indexes. Not plunging into 
detail here*, we point out, for instance, the nonlinear mode of the formation of 
financial indexes and the aftereffect, the capacity of many indexes, prices, and so 
on, to ‘remember’ their past. 


4. To give one a notion of the frequency of ticks in currency cross rates, develop- 
ing as in the above chart, and also an idea of the bulk of the available statistical 
information, we now cite the data of Olsen & Associates, an institute mentioned in 
the footnote on this page; see [90], [91], and also [204]. 

Over the period January 1, 1987-December 31, 1993, Reuters registered 


8 238 532 


ticks in the DEM/USD cross rate. Of these, 1 466 946 ticks occurred during one 
year, from October 1, 1992 to September 30, 1993. Over the same period, Reuters 
registered 570 814 ticks of the exchange rate JPY/USD (see also the table in § 1b).


* For that, see the proceedings [393] of the HFDF-1 conference organized by the
Research Institute for Applied Economics Olsen & Associates (March 29-31, 1995, Zürich,
Switzerland). The introductory talk “High Frequency Data in Financial Markets: Issues
and Applications” by C. A. E. Goodhart and M. O’Hara provides a brilliant introduction
into the range of problems arising in connection with ‘high frequencies’, describes new
phenomena, peculiarities, and results, their interpretations, and suggests a program of
further research.
These are high-frequency data indeed: on a typical business day, there occur
on average 4 500 ticks of the DEM/USD exchange rate and around 2 thousand
ticks for JPY/USD. In July 1994, the number of ticks of the DEM/USD rate was
close to 9000, that is, 15-20 ticks per minute. (On usual days, there occur on
average 3-4 ticks each minute.)

It should be noted that, of all the exchange rates, the rate DEM/USD is in 
general subject to most frequent changes. It is also worth noting that the quo- 
tations (bid and ask prices) cited by various agencies are not the real transaction 
prices. To our knowledge, the corresponding data or the statistics of the volumes 
of transactions are not easily accessible. 


5. The above example relates to currency exchange rates; however, many other 
financial indexes behave in a similar way. We can refer to [127; p. 284] for a chart 
depicting the behavior of Siemens shares on the Frankfurt Stock Exchange on March 
2, 1992, since the opening at 10 h 30. The pattern is the same as in Fig. 28: the 
price is flat for some time, then (at a random instant) it changes the value. 


6. One can find extensive statistics of stock price ‘ticks’ in [217].
Various information relating to the ticks in the prices of all kinds of securities 
(including shares and bonds) can be obtained, for example, from 


ISSM - the Institute for the Study of Securities Markets, 
NYSE - the New York Stock Exchange. 


Berkeley Options Data provides data on the ask and bid prices of options and on
the instantaneous prices of CBOE (the Chicago Board Options Exchange); from the
Commodity Futures Trading Commission (CFTC) one can obtain data concerning
the American futures market, while the current American stock prices can be,
for example, obtained by e-mail, from the MIT Artificial Intelligence Laboratory
(http://www.stockmaster.com).


§1b. Geography-Related Features of the Statistical Data 
on Exchange Rates 


1. By contrast to, say, bourses, which can trade in stock, bonds, futures and are 
open only on ‘trading days’, the currency exchange market, the 


FX-market 


(Foreign Exchange or Forex) has several peculiarities that one would like to consider 
more closely. 

It should be noted first of all that the FX-market is inherently international. It 
cannot be ‘localized’, has no premises (of the sort NYSE or CBOE have). Instead, 
this is an interweaving network of banks and exchange offices, equipped with modern 
communications, all over the world. 
The FX-market operates continuously round the clock; it is more active during
the five working days, and less active on week-ends or holidays (for example, on
Easter Monday).


2. Judging by the periods of different intensity of currency trade during the day, one 
usually distinguishes the following three geographic zones (indicated is Greenwich 
Mean Time, GMT): 


(1) the East-Asian, with center at Tokyo, 21:00-7:00; 
(2) the European, with center at London, 7:00-13:00; 
(3) the American, with center at New York, 13:00-20:00. 


According to [D4], the East-Asian zone includes Australia, Hong Kong, India, 
Indonesia, Japan, South Korea, Malaysia, New Zealand, and Singapore. 

The European zone covers Austria, Bahrain, Belgium, Germany, Denmark, Fin- 
land, France, Great Britain, Greece, Ireland, Italy, Israel, Jordan, Kuwait, Lux- 
embourg, the Netherlands, Norway, Saudi Arabia, South Africa, Spain, Sweden, 
Switzerland, Turkey, and United Arab Emirates. 

The American zone includes Argentina, Canada, Mexico, and the USA. 

Sometimes one distinguishes four zones in place of the three in our scheme, 
where the fourth, Pacific, zone is included in the East Asian one. 

To the already mentioned main centers of trade, Tokyo, London, and New York, 
one could add also Sydney, Hong Kong, Singapore (the East-Asian zone), Frankfurt- 
on-Main, Zurich (the European zone), and Toronto (the American zone). 

Taking the American dollar (USD) as the basis, the bulk of the trade proceeds
between the dollar and the following ‘most important’ currencies: the German
mark (DEM), the Japanese yen (JPY), the British pound (GBP), and the Swiss
franc (CHF) (we use the abbreviations adopted by the International Organization
for Standardization (ISO), standard 4217).

These ‘basic’ currencies (including, of course, the American dollar) are traded 
all over the globe. Other currencies are mostly traded in their geographic zones. 


3. To gain some impression of the ‘intensity’ of currency exchange, that is, of 
the frequency of ‘ticks’ in the cross rates on the FX-market (a global market, as 
already mentioned) one can consider the following graph (borrowed from [181]) of 
the ‘intensity’ of changes in the DEM/USD exchange rate. 

The average number of changes, ‘ticks’ occurring every 5 minutes (Fig. 29) or 
20 minutes (Fig. 30) is plotted along the vertical axis and time is plotted along the 
horizontal axis. 

In these charts one can distinguish cycles, which are clearly related to the rota- 
tion of Earth, and inhomogeneity in the number of changes (‘ticks’). Three peaks 
of activity, related to the three geographic zones, are discernible. 

The European and American peaks are fairly similar. The peak of activity in 
Europe occurs immediately after lunch, when the business day begins in America. 
The minimum of activity falls precisely at Tokyo lunch time, when it is night in
Europe and America.


FIGURE 29. Intensity of the DEM/USD exchange rate from Monday
through Friday, over 5-minute intervals

FIGURE 30. The same as in Fig. 29, over 20-minute intervals


We supplement Fig. 29 and 30 with the following chart from [427], depicting the
intraday intensity (the average number of ‘ticks’ occurring at each of the 24 hours)
of the DEM/USD cross rate over the period October 5, 1992-September 26, 1993
(the data of Reuters).

FIGURE 31. Average hourly number of ‘ticks’ during one day
(24 hours) for the DEM/USD cross rate. Zero corresponds to 0 h 00
GMT
4. The FX-market is the largest financial market. According to the Bank for
International Settlements in Basel, Switzerland (1993), the daily turnover on this market
was approximately USD 832 bn ($832 \cdot 10^9$) in 1992!

One of the largest databases covering the FX-market is to our knowledge the
database of Olsen & Associates (see the footnote in § 1a.4). The table based on
its data gives one an idea of transactions on the FX-market over the period
January 1, 1987-December 31, 1993.


Exchange rate   Total number of ‘ticks’        Average daily number of ‘ticks’
                registered in the database     (52 weeks in a year,
                                               5 business days in a week)

DEM/USD         8 238 532                      4500
JPY/USD         4 230 041                      2300
GBP/USD         3 469 421                      1900
CHF/USD         3 452 559                      1900
FRF/USD         2 098 679                      1150
JPY/DEM           190 967                       630
FRF/DEM           132 089                       440
ITL/DEM           118 114                       390
GBP/DEM            96 537                       320
...
NLG/DEM            20 355                        70
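The daily averages in the table can be reproduced from the totals with the stated convention of 52 weeks a year and 5 business days a week. The sketch below assumes the full seven-year window (January 1, 1987-December 31, 1993) for the five USD rates; the names `BUSINESS_DAYS`, `TOTAL_TICKS`, and `average_daily_ticks` are illustrative only.

```python
# Seven years, 52 weeks a year, 5 business days a week (the table's convention).
BUSINESS_DAYS = 7 * 52 * 5

# Totals copied from the table (USD rates only).
TOTAL_TICKS = {
    "DEM/USD": 8_238_532,
    "JPY/USD": 4_230_041,
    "GBP/USD": 3_469_421,
    "CHF/USD": 3_452_559,
    "FRF/USD": 2_098_679,
}


def average_daily_ticks(total: int, days: int = BUSINESS_DAYS) -> float:
    """Average number of ticks per business day."""
    return total / days


if __name__ == "__main__":
    for rate, total in TOTAL_TICKS.items():
        print(rate, round(average_daily_ticks(total)))
```

The results agree with the tabulated values after rounding. The same divisor does not reproduce the rows for the DEM cross rates (for example, 190 967 / 630 is about 303 days), which suggests a shorter observation window for those rates; this is an inference from the numbers, not something the table states.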


5. In the above description of the dynamics of exchange rates we were concerned 
only with the temporal aspect: the frequency (intensity) of changes as a function of 
time. Looking back at the classification in § 1a.2 we see that this discussion relates 
to issue (I), the statistics of ‘intertick’ intervals. However, we have said nothing so 
far about the character of these changes in the values of prices that occur at the 
instants of ticks. We turn to this question (issue (II)) in §1d, while in the next 
section we shall consider the probabilistic and statistical models coming to mind in 
a natural way in connection with the trajectory behavior of the prices $(S_t^a)$ and $(S_t^b)$
(see Fig. 28).


§1c. Description of Financial Indexes as Stochastic Processes 
with Discrete Intervention of Chance 


1. As already said, the monitor of, say, a Reuters customer interested in the FX-market
shows at each instant $t$ two quotations: the ask price $S_t^a$ and the bid price $S_t^b$
of, say, the American dollar (USD) in German marks (DEM).

The difference $S_t^a - S_t^b$, the spread, is an important characteristic of the market
situation. It is well known that the spread is positively correlated with the
volatilities of the prices (which are understood here as the usual standard deviations).
Thus, a rise in volatility (which increases risks, due to the decreasing accuracy of
the predictions of the price development) prompts traders to widen the spread as
a compensation for greater risks.


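2. We now represent the prices $S_t^a$ and $S_t^b$ in the following form:

$$S_t^a = S_0^a e^{H_t^a}, \qquad S_t^b = S_0^b e^{H_t^b}. \tag{1}$$

We also set

$$H_t = \frac{1}{2}\,(H_t^a + H_t^b), \tag{2}$$

$$S_t = S_0 e^{H_t}, \quad \text{where } S_0 = \sqrt{S_0^a \cdot S_0^b}. \tag{3}$$

Then

$$H_t = \ln \sqrt{\frac{S_t^a}{S_0^a} \cdot \frac{S_t^b}{S_0^b}} \tag{4}$$

is the logarithm of the geometric mean of the normalized prices $S_t^a / S_0^a$ and
$S_t^b / S_0^b$, and

$$S_t = \sqrt{S_t^a \cdot S_t^b}. \tag{5}$$

It is the so-defined prices $S = (S_t)$ and their logarithms $H = (H_t)$ that one
usually deals with in the analysis of exchange rates, so that the two really existing
processes $(S_t^a)$ and $(S_t^b)$ are reduced to a single one, $(S_t)$.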
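A small numerical check of relations (1)-(5) may help. The sketch below uses the bid and ask prices 1.6290 and 1.6300 from the Reuters message in § 1a as the initial quotation; the later quotation (1.6302, 1.6310) is a made-up value for illustration, and the function names are not from the source.

```python
import math


def geometric_mid(ask: float, bid: float) -> float:
    """S_t = sqrt(S_t^a * S_t^b), formula (5)."""
    return math.sqrt(ask * bid)


def log_price(ask: float, bid: float, ask0: float, bid0: float) -> float:
    """H_t = (H_t^a + H_t^b) / 2, where H_t^a = ln(S_t^a / S_0^a) and likewise for the bid."""
    return 0.5 * (math.log(ask / ask0) + math.log(bid / bid0))


if __name__ == "__main__":
    ask0, bid0 = 1.6300, 1.6290      # initial quotation (from the Reuters example)
    ask1, bid1 = 1.6310, 1.6302      # hypothetical later quotation
    s0 = geometric_mid(ask0, bid0)
    h1 = log_price(ask1, bid1, ask0, bid0)
    # Formula (3): S_t = S_0 * exp(H_t) should coincide with formula (5).
    print(s0 * math.exp(h1), geometric_mid(ask1, bid1))
```

Both printed numbers agree up to rounding, which is the consistency between (3) and (5).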

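The evolution of $S = (S_t)_{t \ge 0}$ and $H = (H_t)_{t \ge 0}$, where $H_t = \ln \dfrac{S_t}{S_0}$, in real
(‘physical’) time $t$ can be fairly adequately (see Fig. 28 in § 1a) described by
stochastic processes with discrete intervention of chance:

$$S_t = S_0 + \sum_{k \ge 1} s_{\tau_k} I(\tau_k \le t) \tag{6}$$

and

$$H_t = \sum_{k \ge 1} h_{\tau_k} I(\tau_k \le t), \tag{7}$$

where $0 = \tau_0 < \tau_1 < \tau_2 < \cdots$ are the successive instants of ‘ticks’, that is, the
instants of price changes, and

$$s_{\tau_k} = \Delta S_{\tau_k}, \qquad h_{\tau_k} = \Delta H_{\tau_k}$$

are $\mathcal{F}_{\tau_k}$-measurable variables (here we set $\Delta S_{\tau_k} = S_{\tau_k} - S_{\tau_k -} = S_{\tau_k} - S_{\tau_{k-1}}$; the
same holds for $\Delta H_{\tau_k}$).
Clearly,

$$h_{\tau_k} = \Delta H_{\tau_k} = \ln \frac{S_{\tau_k}}{S_{\tau_{k-1}}} = \ln\Bigl(1 + \frac{\Delta S_{\tau_k}}{S_{\tau_{k-1}}}\Bigr) = \ln\Bigl(1 + \frac{s_{\tau_k}}{S_{\tau_{k-1}}}\Bigr).$$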

We already considered processes of types (6) and (7) in Chapter II, §1f. We 
recall several concepts relating to these important processes, which we shall use 
in the present analysis of the evolution of currency cross rates and other financial 
indexes. 

For simplicity of notation we set

$$\xi_k = s_{\tau_k} = \Delta S_{\tau_k}.$$

The probability distribution of the process $S = (S_t)_{t \ge 0}$ (which we denote by
$\mathrm{Law}(S)$ or $\mathrm{Law}(S_t,\ t \ge 0)$) is completely defined by the joint distribution $\mathrm{Law}(\tau, \xi)$
of the sequence

$$(\tau, \xi) = (\tau_k, \xi_k)_{k \ge 1}$$

of ‘ticks’ $\tau_k$ and their ‘marks’ $\xi_k$. Such a sequence is commonly called a marked point
process or a multivariate point process (see, for instance, [250; Chapter III, § 1c]).
The name point process is usually reserved for the sequence $\tau = (\tau_k)$ alone (see
Chapter II, § 1f and, in greater detail, [250; Chapter II, § 3c]), which is in our case
the sequence of the instants of price jumps $\tau_0 = 0 < \tau_1 < \tau_2 < \cdots$.
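Setting

$$N_t = \sum_{k \ge 1} I(\tau_k \le t) \tag{8}$$

we obtain

$$\tau_k = \inf\{t : N_t = k\} \tag{9}$$

(here, as usual, $\tau_k = \infty$ if $\{t : N_t = k\} = \varnothing$).

The process $N = (N_t)_{t \ge 0}$ is called a counting process, and formulas (8) and (9)
clearly establish a one-to-one correspondence $N \Leftrightarrow \tau$. As regards distributions,
the distribution $\mathrm{Law}(\tau, \xi)$ is completely defined by the conditional distributions

$$\mathrm{Law}\bigl(\tau_{k+1} \mid \tau_0, \dots, \tau_k;\ \xi_0, \dots, \xi_k\bigr) \tag{10}$$

and

$$\mathrm{Law}\bigl(\xi_{k+1} \mid \tau_0, \dots, \tau_k, \tau_{k+1};\ \xi_0, \dots, \xi_k\bigr), \tag{11}$$

where $\tau_0 = 0$ and $\xi_0 = S_0$.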
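Formulas (6), (8), and (9) translate directly into code once a finite path of ticks and marks is given. The following is a minimal sketch under the assumption that the tick times are stored as a strictly increasing list `ticks` (holding $\tau_1, \tau_2, \dots$) with the matching list `marks` (holding $\xi_1, \xi_2, \dots$); all names and the sample values are illustrative.

```python
from __future__ import annotations

import math
from bisect import bisect_right


def counting_process(ticks: list[float], t: float) -> int:
    """N_t: the number of tick times tau_k, k >= 1, with tau_k <= t (formula (8))."""
    return bisect_right(ticks, t)


def price(s0: float, ticks: list[float], marks: list[float], t: float) -> float:
    """S_t = S_0 + sum of the marks xi_k over all ticks with tau_k <= t (formula (6))."""
    return s0 + sum(marks[: counting_process(ticks, t)])


def tick_time(ticks: list[float], k: int) -> float:
    """tau_k = inf{t : N_t = k} (formula (9)); infinity if the path has fewer than k ticks."""
    if k == 0:
        return 0.0
    return ticks[k - 1] if k <= len(ticks) else math.inf


if __name__ == "__main__":
    ticks = [1.5, 4.0, 4.5]                # hypothetical tick times
    marks = [-0.0002, 0.0005, -0.0001]     # hypothetical price jumps
    print(counting_process(ticks, 4.2), price(1.6300, ticks, marks, 4.2))
```

Binary search is used because the tick times are ordered, so each evaluation of $N_t$ costs a logarithmic number of comparisons.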
3. So far, nothing has been said about $\tau_k$ and $\xi_k$ as ‘random’ variables defined on
some probability space. This question is worth a slightly more thorough discussion.

To be able to use the well-developed machinery of stochastic calculus in our
analysis of stochastic processes (for instance, multivariate point processes that we
consider now) and also to take account of the flow of information determining the
prices, assume that from the very beginning we have some filtered probability space
(stochastic basis)

$$(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t \ge 0}, \mathsf{P}).$$

Here $(\mathcal{F}_t)_{t \ge 0}$ is a flow (filtration) of σ-algebras $\mathcal{F}_t$ that are the ‘carriers of
market-related information available on the time intervals $[0, t]$’.

Once we have this filtered probability space, it comes naturally to regard the
$\tau_k$ as random variables ($\tau_k = \tau_k(\omega)$) that are Markov times with respect to $(\mathcal{F}_t)$:

$$\{\tau_k \le t\} \in \mathcal{F}_t, \quad t \ge 0. \tag{12}$$

In the same way we regard the $\xi_k$ as $\mathcal{F}_{\tau_k}$-measurable random variables ($\xi_k = \xi_k(\omega)$),
where $\mathcal{F}_{\tau_k}$ is the σ-algebra of events observable on the interval $[0, \tau_k]$, that is, of
events $A \in \mathcal{F}$ such that

$$A \cap \{\tau_k \le t\} \in \mathcal{F}_t \tag{13}$$

for each $t \ge 0$.


§ 1d. On the Statistics of ‘Ticks’


1. We now discuss what is known about the unconditional probability distributions
$\mathrm{Law}(\tau_1, \tau_2, \dots)$.

For $k \ge 1$ we set

$$\Delta_k = \tau_k - \tau_{k-1},$$

where $\tau_0 = 0$. Clearly, the knowledge of $\mathrm{Law}(\Delta_1, \Delta_2, \dots)$ is equivalent to the
knowledge of $\mathrm{Law}(\tau_1, \tau_2, \dots)$, so that we can restrict ourselves to the analysis of
the distribution of the time intervals between ‘ticks’.

Starting from the conjecture that the variables $\Delta_1, \Delta_2, \dots$ are identically
distributed and independent (this assumption enables one to justify, on the basis of
the law of large numbers, the standard statistical constructions of estimators for
parameters, distributions, and so on) one can gain a clear impression of the character
of their probability distribution from bar charts for the empirical density $p(\Delta)$
constructed on the basis of the available statistical data.

In [145] one can find the following results of such an analysis of 1 472 240
intervals between ‘ticks’ in the DEM/USD cross rate (the data supplied by Olsen &
Associates [221]).

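Up to a constant, we have

$$p(\Delta) \sim \begin{cases} \Delta^{-(1 + \lambda_1)}, & 23\ \mathrm{s} \le \Delta < 3\ \mathrm{m}, \\ \Delta^{-(1 + \lambda_2)}, & 3\ \mathrm{m} \le \Delta < 3\ \mathrm{h}, \end{cases} \tag{1}$$

where

$$\lambda_1 = 0.13 \quad \text{and} \quad \lambda_2 = 0.61. \tag{2}$$

(This is a result of the analysis of the logarithms $\ln p(\Delta)$ as functions of $\ln \Delta$; the
estimates for $\lambda_1$ and $\lambda_2$ are obtained by the least squares method.)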

It is worth recalling in connection with (1) that there is one distribution, well
known in mathematical statistics, whose density decreases as a power function:
the Pareto distribution with density

$$f_{\alpha b}(x) = \begin{cases} \dfrac{\alpha b^{\alpha}}{x^{\alpha + 1}}, & x \ge b, \\ 0, & x < b \end{cases} \tag{3}$$

(here $\alpha > 0$, $b > 0$; see Table 2 in Chapter III, § 1a).

Note that quite often, in particular, in the finance literature, the authors mean
by distributions of Pareto type (or simply Pareto distributions) ones whose density
decreases as a power function at infinity.

Following this trend we can say that, as (1) shows, we have a Pareto distribution
with exponent $\alpha = \lambda_1$ on the interval [23 s, 3 m), and a Pareto distribution with
exponent $\alpha = \lambda_2$ on the interval [3 m, 3 h).
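The least squares procedure mentioned above (regressing the logarithm of the empirical density on the logarithm of the argument) can be illustrated on synthetic data. The sketch below is not the analysis of [145]: it draws a sample from the Pareto density (3) by inverse transform, builds an empirical density on logarithmic bins, and fits a straight line; the bin ratio, the minimum bin count, and all function names are assumptions made for the illustration. For a Pareto exponent $\alpha$ the fitted slope should be close to $-(1 + \alpha)$.

```python
from __future__ import annotations

import math
import random


def sample_pareto(alpha: float, b: float, n: int, seed: int = 0) -> list[float]:
    """Draw n values with density alpha * b**alpha / x**(alpha + 1), x >= b (inverse transform)."""
    rng = random.Random(seed)
    return [b * (1.0 - rng.random()) ** (-1.0 / alpha) for _ in range(n)]


def log_log_slope(sample: list[float], b: float, ratio: float = 2.0, min_count: int = 100) -> float:
    """Least-squares slope of ln(empirical density) against ln(x) on logarithmic bins."""
    counts: dict[int, int] = {}
    for x in sample:
        j = int(math.log(x / b) / math.log(ratio))   # index of the bin [b*ratio^j, b*ratio^(j+1))
        counts[j] = counts.get(j, 0) + 1
    xs, ys = [], []
    for j, c in sorted(counts.items()):
        if c < min_count:
            continue                                  # skip sparsely populated tail bins
        left = b * ratio ** j
        width = left * (ratio - 1.0)
        xs.append(math.log(left * math.sqrt(ratio)))  # ln of the geometric bin centre
        ys.append(math.log(c / (len(sample) * width)))  # ln of the density estimate
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


if __name__ == "__main__":
    data = sample_pareto(alpha=0.61, b=23.0, n=100_000, seed=1)
    print(log_log_slope(data, b=23.0))   # expected to be near -(1 + 0.61)
```

Logarithmic bins are chosen because, for a power-law density, the bin averages then lie exactly on a straight line in log-log coordinates, so the fit is not biased by the binning.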

It should be noted that, in the description of the probabilistic properties of a 
particular index, one can rarely expect (as it is often visible from the statistical 
analysis of the above kind) to make do with a single ‘standard’ distribution de- 
pendent on few unknown parameters. The reason probably lies with the fact that 
traders on the market have different goals, constraints, and react to risks in different 
ways. 


2. In general, there are no a priori reasons to assume that the variables $\Delta_1, \Delta_2, \dots$
are independent. Moreover, as shown by empirical analysis, the time when the next
‘tick’ would occur essentially depends on the ‘intensity’, ‘frequency’ of ‘ticks’ in
the past. The problem of an adequate description of the conditional distributions
$\mathrm{Law}(\Delta_k \mid \Delta_1, \dots, \Delta_{k-1})$ is therefore timely.

In this connection we present, following [143], an interesting model serving for
such a description. This is the ACD model (Autoregressive Conditional Duration
model), and it is related to the ARCH family.

Let $\psi_k = \psi_k(\Delta_1, \dots, \Delta_{k-1})$ be (sufficient) statistics such that

$$\mathsf{P}(\Delta_k \le x \mid \Delta_1, \dots, \Delta_{k-1}) = \mathsf{P}(\Delta_k \le x \mid \psi_k). \tag{4}$$

The simplest conditional distribution for $\Delta_k$ under the condition $\psi_k$ that comes to mind
is the exponential distribution with density

$$p(\Delta \mid \psi_k) = \frac{1}{\psi_k}\, e^{-\Delta / \psi_k}, \quad \Delta \ge 0, \tag{5}$$

where the (random) parameters $(\psi_k)$ are defined recursively; namely,

$$\psi_k = \alpha_0 + \alpha_1 \Delta_{k-1} + \beta_1 \psi_{k-1} \tag{6}$$

with $\Delta_0 = 0$, $\psi_0 = 0$, $\alpha_0 > 0$, $\alpha_1 \ge 0$, and $\beta_1 \ge 0$.
Clearly, relations (4)-(6) define the conditional distributions

$$\mathrm{Law}(\Delta_k \mid \Delta_1, \dots, \Delta_{k-1})$$

completely, and the resulting variables $\Delta_1, \Delta_2, \dots$ are, in general, dependent.
Note the following formula for the conditional expectations in this model:

$$\mathsf{E}(\Delta_k \mid \Delta_{k-1}, \dots, \Delta_1) = \psi_k. \tag{7}$$

This first-order autoregressive model (6) can obviously be generalized to higher
orders (see Chapter I, § 1d and [143]).
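The recursion (5)-(6) is easy to simulate. The sketch below is an illustration under assumed parameter values ($\alpha_0 = 0.2$, $\alpha_1 = 0.1$, $\beta_1 = 0.7$), not estimates from any data set; the function name `simulate_acd` is hypothetical. Taking expectations in (6) and using (7) gives, when $\alpha_1 + \beta_1 < 1$, the stationary mean duration $\alpha_0 / (1 - \alpha_1 - \beta_1)$, which the simulation can be compared against.

```python
from __future__ import annotations

import random


def simulate_acd(alpha0: float, alpha1: float, beta1: float, n: int, seed: int = 0) -> list[float]:
    """Simulate n durations Delta_1, ..., Delta_n from relations (5)-(6)."""
    rng = random.Random(seed)
    durations: list[float] = []
    delta_prev, psi_prev = 0.0, 0.0          # Delta_0 = 0, psi_0 = 0
    for _ in range(n):
        psi = alpha0 + alpha1 * delta_prev + beta1 * psi_prev   # recursion (6)
        delta = rng.expovariate(1.0 / psi)   # exponential law (5) with mean psi
        durations.append(delta)
        delta_prev, psi_prev = delta, psi
    return durations


if __name__ == "__main__":
    sample = simulate_acd(0.2, 0.1, 0.7, n=100_000, seed=1)
    print(sum(sample) / len(sample))         # compare with 0.2 / (1 - 0.1 - 0.7) = 1.0
```

Because $\psi_k$ depends on the previous duration, long intervals tend to be followed by long ones, which is the dependence between the $\Delta_k$ that the model is meant to capture.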


2. Statistics of One-Dimensional Distributions 


§ 2a. Discretizing Statistical Data 


1. Having gained some idea of the character of the ‘intensity’ (frequency) of ‘ticks’
(at any rate, in the example of currency cross rates) and of the one-dimensional
distributions of the ‘intertick’ intervals $(\tau_k - \tau_{k-1})$, it now seems appropriate to
consider the statistics of price ‘changes’, that is, of the sequence $(S_{\tau_k} - S_{\tau_{k-1}})_{k \ge 1}$
or of related variables, for instance, $h_{\tau_k} = \ln \dfrac{S_{\tau_k}}{S_{\tau_{k-1}}}$.

We point out straight away that ‘daily’, ‘weekly’, ‘monthly’, ... data are different
from the ‘intraday’ ones. The former can be regarded as data received at equal,
‘regular’, time intervals $\Delta$ (for instance, $\Delta$ can be one day or one week), while in the
analysis of ‘intraday’ statistics one must consider figures arriving in a ‘nonregular
way’, at random instants $\tau_1, \tau_2, \dots$ with different intervals $\Delta_1, \Delta_2, \dots$ between
them (here $\Delta_k = \tau_k - \tau_{k-1}$).


This ‘lack of regularity’ brings forward certain difficulties in the application of 
the already developed methods of the statistical data analysis. For that reason, 
one usually carries out some preliminary data processing (‘discretizing’ the data, 
‘rejecting’ abnormal observations, ‘smoothing’, separating trend components, and 
so on). 

We now dwell on the methods of discretization. 

We fix a ‘reasonable’ interval $\Delta$ of (real, physical) time. It should not be very
small: in the case of the statistics of exchange rates we must ensure that this interval
is representative, that is, it is certain to contain a large number of ‘ticks’ (in other
words, $\Delta$ must be considerably larger than the average time between two ‘ticks’).
Otherwise, there will be too many ‘blank spaces’ in our ‘discretized data’.

For the ‘exchange rates’ of the ‘basic’ currencies it is recommended in [204] to
take $\Delta$ not shorter than 10 minutes, which (besides the already mentioned
‘representativeness’) will also enable one to avoid uncertainties arising for small values
of $\Delta$, when the amplitude of spreads and the range of changes in bid and ask prices
are comparable.

The simplest discretization method works as follows: having chosen $\Delta$ (say,
10 m, 20 m, 24 h, ...), we replace the piecewise constant continuous-time process
$S = (S_t)_{t \ge 0}$ by the sequence $S^\Delta = (S_{t_k})$ with discrete time $t_k = k\Delta$,
$k = 0, 1, \dots$.

Using another widespread method one first replaces the piecewise constant
process

$$S_t = S_0 + \sum_{k \ge 1} \xi_k I(\tau_k \le t) \tag{1}$$

by its continuous modification $\tilde S = (\tilde S_t)$ obtained by linear interpolation between
the values of $(S_{\tau_k})$:

$$\tilde S_t = S_{\tau_k}\, \frac{\tau_{k+1} - t}{\tau_{k+1} - \tau_k} + S_{\tau_{k+1}}\, \frac{t - \tau_k}{\tau_{k+1} - \tau_k}, \qquad \tau_k < t \le \tau_{k+1}. \tag{2}$$
FIGURE 32. Piecewise-constant process $(S_t)$ and its continuous modification $(\tilde S_t)$


After that, we discretize this modification $\tilde S = (\tilde S_t)$ using the simplest method;
that is, we construct the sequence $\tilde S^\Delta = (\tilde S_{t_k})$, where $t_k = k\Delta$, $k = 0, 1, \dots$, and $\Delta$
is the time interval that is of importance to a particular investor or trader (one
month, one day, 20 minutes, 5 minutes, ...).
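The two discretization schemes just described can be sketched in a few lines of Python.
This is only an illustration under stated assumptions: the function names and the sample
ticks are hypothetical, the list `times` is assumed to be increasing and to start with the
instant 0 (carrying the value $S_0$), and after the last tick the interpolated path is simply
kept flat.

```python
from bisect import bisect_right

def piecewise_constant(times, prices, t):
    """S_t: the price set at the last tick time tau_k <= t."""
    k = bisect_right(times, t) - 1
    return prices[k]

def linear_modification(times, prices, t):
    """Interpolated value as in (2); kept flat after the last tick (an assumption)."""
    k = bisect_right(times, t) - 1
    if k == len(times) - 1:
        return prices[k]
    w = (t - times[k]) / (times[k + 1] - times[k])
    return (1.0 - w) * prices[k] + w * prices[k + 1]

def discretize(path, times, prices, delta, n_nodes):
    """Sample path(t) at the nodes t_k = k * delta, k = 0, ..., n_nodes - 1."""
    return [path(times, prices, k * delta) for k in range(n_nodes)]

# Hypothetical ticks: times[0] = 0 carries S_0.
times = [0.0, 1.0, 4.0, 6.0]
prices = [10.0, 11.0, 9.0, 12.0]
s_delta = discretize(piecewise_constant, times, prices, delta=2.0, n_nodes=4)
s_tilde_delta = discretize(linear_modification, times, prices, delta=2.0, n_nodes=4)
```

At the node $t = 2$ the first scheme returns the last quoted price 11, while the second
returns a value between the neighbouring quotes 11 and 9.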


2. Besides discretization with respect to time, statistical data can also be quantized,
that is, rounded off with respect to the phase variable. Usually, this is carried out as follows.

We choose some $\gamma > 0$ and, instead of the original process $S = (S_t)_{t \ge 0}$, introduce
a new one, $S(\gamma) = (S_t(\gamma))_{t \ge 0}$, with variables

$$S_t(\gamma) = \gamma \left[ \frac{S_t}{\gamma} \right], \tag{3}$$

where $[\,\cdot\,]$ denotes the integer part.

For example, if $\gamma = 1$ and $S_t = 10.54$, then $S_t(1) = 10$; while if $\gamma = 3$, then
$S_t(3) = 9$. Hence it is clear that (3) corresponds to rounding off with error not
larger than $\gamma$.
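The rounding rule (3) is easy to check numerically. The short Python sketch below
(the function name `quantize` is chosen only for this illustration) reproduces the two
values of the example above.

```python
import math

def quantize(s, gamma):
    """gamma-quantization (3): the largest multiple of gamma not exceeding s."""
    return gamma * math.floor(s / gamma)

print(quantize(10.54, 1))  # 10
print(quantize(10.54, 3))  # 9
```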




If $\gamma$-quantization is carried out first, and $\Delta$-discretization next, then we obtain
from $(S_t)$ another sequence, $S^\Delta(\gamma)$ or $\tilde S^\Delta(\gamma)$.

Since $S_t(\gamma) \to S_t$ as $\gamma \to 0$, there arises the question how one can consistently
choose the values of $\Delta$ and $\gamma$ so that the amount of information contained
in the variables $S_{t_k}(\gamma)$ with $t_k = k\Delta$, $k = 0, 1, \dots$, is ‘almost the same’ as the
information in $(S_t)$. As an initial approach it seems reasonable (following a suggestion
of J. Jacod) to find conditions on the rates of convergence $\Delta \to 0$ and $\gamma \to 0$
ensuring the convergence of the finite-dimensional distributions of the processes
$S^\Delta(\gamma)$ and $\tilde S^\Delta(\gamma)$ to the corresponding distributions for $S$.


§ 2b. One-Dimensional Distributions of the Logarithms 
of Relative Price Changes. Deviation from the Gaussian 
Property and Leptokurtosis of Empirical Densities 


1. Let $S = (S_t)_{t \ge 0}$ be the process of some cross rate (DEM/USD, say), and let
$\tilde S = (\tilde S_t)_{t \ge 0}$ be the continuous modification of $S = (S_t)_{t \ge 0}$ obtained by
linear interpolation.

Further, let $\tilde S^\Delta = (\tilde S_{t_k})_{k \ge 0}$, where $t_k = k\Delta$, be the ‘$\Delta$-discretization’ of $\tilde S = (\tilde S_t)_{t \ge 0}$.

We have repeatedly mentioned (see, for instance, Chapter I, § 2a.4) that, in the
analysis of price changes, it is not their amplitudes $\Delta \tilde S_{t_k} = \tilde S_{t_k} - \tilde S_{t_{k-1}}$ themselves
that are of real economic significance, but, rather, the relative changes

$$\frac{\Delta \tilde S_{t_k}}{\tilde S_{t_{k-1}}} = \frac{\tilde S_{t_k}}{\tilde S_{t_{k-1}}} - 1.$$

It is understandable for that reason that one is usually not so much
interested in the distribution of $\tilde S_{t_k}$ as in the distribution of $\tilde H_{t_k} = \ln \frac{\tilde S_{t_k}}{\tilde S_0}$.

We now set

$$\tilde h^{(\Delta)}_{t_k} = \Delta \tilde H_{t_k} \quad (= \tilde H_{t_k} - \tilde H_{t_{k-1}}), \tag{1}$$

where $t_k = k\Delta$, $k \ge 1$, and $\tilde H_0 = 0$.

Bearing in mind our construction of $\tilde S^\Delta$ from $S$ and the notation ($h_{\tau_k} = \Delta H_{\tau_k}$
and $H_t = \ln \frac{S_t}{S_0}$) introduced in § 2a, we obtain

$$\tilde h^{(\Delta)}_{t_k} = \sum_{\{i:\ t_{k-1} < \tau_i \le t_k\}} h_{\tau_i} + \tilde r^{(\Delta)}_{t_k}, \tag{2}$$

where the term $\tilde r^{(\Delta)}_{t_k}$ reflects effects related to the end-points of the partitioning
intervals and can be called the ‘remainder’ because it is small compared with the
sum.
Indeed, if, e.g., $\Delta$ is 1 hour, then (see the table in § 1b.4) the average number of
‘ticks’ of the DEM/USD exchange rate is approximately 187 (= 4500 : 24). Hence
the value of the sum in (2) is determined by the 187 values of the $h_{\tau_i}$. At the same
time, $|\tilde r^{(\Delta)}_{t_k}|$ is knowingly not larger than the sum of the absolute values of the four
increments $h_{\tau_i}$ corresponding to the ticks occurring immediately before and after
the instants $t_{k-1}$ and $t_k$.

It must be pointed out that the sum in (2) is the sum of a random number
of random variables, and it may have a fairly complicated distribution even if the
distributions of its terms and of the random number of terms are relatively simple.

This is a sort of technical explanation why, as we show below, we cannot assume
that the variables $\tilde h^{(\Delta)}_{t_k}$ have a Gaussian distribution. True, we shall also see that the
conjecture of the Gaussian distribution of $\tilde h^{(\Delta)}_{t_k}$ becomes ever more likely with the
increase of $\Delta$, which leads to the increase in the number of summands in (2). This
is where the influence of the Central limit theorem about sums of large numbers of
variables becomes apparent.

Remark. The notation $\tilde h^{(\Delta)}_{t_k}$ is fairly unwieldy, although it is indicative of the
construction of these variables. In what follows, we shall also denote them by $h_k$ for
simplicity (describing their construction explicitly and indicating the chosen value
of $\Delta$).

2. Thus, assume that we are given some value of $\Delta > 0$. In analyzing the joint
distributions $\mathrm{Law}(h_1, h_2, \dots)$ of the sequence of ‘discretized’ variables $h_1, h_2, \dots$ it is
reasonable to start with the one-dimensional distributions, making the assumption
that these variables are identically distributed. (The suitability of this homogeneity
conjecture as a first approximation is fully supported by the statistical analysis
of many financial indexes; at any rate, for not excessively long time intervals. See
also § 3c below for a description of another construction of the variables $h_k$ that
takes into account the geography-related peculiarities of the intraday cycle.)

As already mentioned, there occurred 8 238 532 ‘ticks’ in the DEM/USD exchange
rate from January 1, 1987 through December 31, 1993 according to Olsen
& Associates. One can get an idea of the estimates obtained for several characteristics
of the one-dimensional distribution of the variables $h_k = \tilde h^{(\Delta)}_{t_k}$ on the basis of
these statistical data from the following table borrowed from [204]:


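$\Delta$ | $N$ | Mean $\bar h_N$ | Variance $\hat m_2$ | Skewness $\hat S_N$ | Kurtosis $\hat K_N$
10 m | 368 000 | $-2.73 \cdot 10^{-7}$ | $2.62 \cdot 10^{-7}$ | 0.17 | 35.11
1 h | 61 200 | $-1.63 \cdot 10^{-6}$ | $1.45 \cdot 10^{-6}$ | 0.26 | 23.55
6 h | 10 200 | $-9.84 \cdot 10^{-6}$ | $9.20 \cdot 10^{-6}$ | 0.24 | 9.44
24 h | 2 100 | $-4.00 \cdot 10^{-5}$ | $3.81 \cdot 10^{-5}$ | 0.08 | 3.33

In this table, $N$ is the total number of the discretization nodes $t_1, t_2, \dots, t_N$,
where $t_k = k\Delta$,

$$\bar h_N = \frac{1}{N} \sum_{i=1}^{N} h_{t_i}, \qquad \hat m_k = \frac{1}{N} \sum_{i=1}^{N} (h_{t_i} - \bar h_N)^k,$$

while

$$\hat S_N = \frac{\hat m_3}{\hat m_2^{3/2}}$$

is the empirical skewness, and

$$\hat K_N = \frac{\hat m_4}{\hat m_2^{2}} - 3$$

is the empirical kurtosis.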
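For the reader who wishes to reproduce such characteristics for other data, here is a
minimal Python sketch of the four empirical quantities in the table. It assumes that the
central moments are normalized by $N$ and that the kurtosis is measured relative to the
Gaussian value (that is, with 3 subtracted); the function name and the toy sample are
hypothetical.

```python
def empirical_moments(h):
    """Sample mean, variance m_2, skewness m_3 / m_2**1.5 and kurtosis m_4 / m_2**2 - 3."""
    n = len(h)
    mean = sum(h) / n

    def central(k):
        # k-th empirical central moment, normalized by N (an assumed convention)
        return sum((x - mean) ** k for x in h) / n

    m2, m3, m4 = central(2), central(3), central(4)
    return {
        "mean": mean,
        "variance": m2,
        "skewness": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2 - 3.0,   # zero for a normal distribution under this convention
    }

stats = empirical_moments([-2.0, -1.0, 0.0, 1.0, 2.0])   # toy data, not market data
```

For this symmetric toy sample the skewness vanishes and the kurtosis is negative, the
opposite of what the table shows for the exchange rate data.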

The theoretical skewness of the normal distribution is zero. The fact that
the empirical coefficient $\hat S_N$ is positive means that the empirical (and maybe, also
the actual) distribution density is asymmetric: the left-hand side of its graph is
steeper than its right-hand side.

It is clear from the table that the modulus of the mean value is considerably less 
than the standard deviation, so that we can set it to be equal to zero in practice. 

The most serious argument against the conjecture of normality is, of course,
the excessively large kurtosis, growing, as we see, with the decrease of $\Delta$. Since
this coefficient is defined in terms of the fourth moment, this also suggests that
the distribution of the $h_k = \tilde h^{(\Delta)}_{t_k}$ must have ‘heavy tails’, i.e., the corresponding
density $p^{(\Delta)}(x)$ decreases relatively slowly as $|x| \to \infty$ (as compared with the
normal density).


3. It is not the shape of the bar-charts of statistical densities alone that speaks
for the deviation of the variables $h_k$ from the normal (Gaussian) property (which
is observable not only in the case of exchange rates, but also for other financial
indexes, for instance, stock prices). It can be discovered also by standard statistical
tests checking for deviations from normality, such as, for example,

(1) the quantile method,

(2) the $\chi^2$-test,

(3) the rank tests.

We recall the essential features of these methods. 

The quantile method can be most readily illustrated by the QQ-plot (see Fig. 33),
where the quantiles of the corresponding normal distribution $\mathcal N(\mu, \sigma^2)$ with
parameters $\mu$ and $\sigma^2$ estimated on the basis of statistical data are plotted along the
horizontal axis and the quantiles of the empirical distribution for the $h_k$ are plotted
along the vertical axis. (The quantile $Q_p$ of order $p$, $0 < p < 1$, of the distribution
of a random variable $\xi$ is, by definition, the value of $x$ such that $P(\xi \le x) \ge p$ and
$P(\xi \ge x) \ge 1 - p$.)

FIGURE 33. QQ-quantile analysis of the DEM/USD exchange rate
with $\Delta = 20$ minutes (according to Reuters; from October 5, 1992
through September 6, 1993; [427]). The quantiles $\hat Q_p$ of the empirical
distribution for the variables $h_k = \tilde h^{(\Delta)}_{t_k}$, $t_k = k\Delta$, $k = 1, 2, \dots$, are
plotted along the vertical axis, and the quantiles $Q_p$ of the normal
distribution are plotted along the horizontal axis

FIGURE 34. A typical graph of the empirical density (for the variables
$h_k = \tilde h^{(\Delta)}_{t_k}$, $k = 1, 2, \dots$) and the graph of the corresponding
theoretical (normal) density

In the case of a good agreement between the empirical and the theoretical
distributions the set $(Q_p, \hat Q_p)$ must be clustered close to the bisector. However, this is not
the case for the statistical data under consideration (exchange rates, stock prices,
and so on). The $(Q_p, \hat Q_p)$-graph in Fig. 33 describes the relations between the
(normal) theoretical density and the empirical density, which we depict in Fig. 34.
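The points of a QQ-plot of this kind can be computed as follows. The Python sketch below
is only an illustration: the normal law is fitted by the sample mean and standard deviation,
the empirical quantile of order $p$ is taken to be the order statistic with number
$\lceil pN \rceil$ (one admissible choice under the definition of a quantile given above),
and the function name and the toy sample are hypothetical.

```python
import math
from statistics import NormalDist, fmean, pstdev

def qq_points(sample, probs):
    """Pairs (normal quantile, empirical quantile) for the given orders p."""
    xs = sorted(sample)
    n = len(xs)
    fitted = NormalDist(fmean(xs), pstdev(xs))     # N(mu, sigma^2) estimated from the data
    points = []
    for p in probs:
        k = min(n, max(1, math.ceil(p * n)))       # number of the order statistic
        points.append((fitted.inv_cdf(p), xs[k - 1]))
    return points

points = qq_points([-2.0, -1.0, 0.0, 1.0, 2.0], [0.1, 0.5, 0.9])   # toy data
```

For data with ‘heavy tails’ the points with $p$ close to 0 or 1 move away from the bisector:
the empirical quantiles are larger in absolute value than the normal ones.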


4. The use of K. Pearson’s $\chi^2$-test in the role of a test of goodness-of-fit is based
on the statistics

$$\hat\chi^2 = \sum_{i=1}^{k} \frac{(\nu_i - n p_i)^2}{n p_i},$$

where the $\nu_i$ are the numbers of the observations falling within certain intervals $I_i$
($\sum_{i=1}^{k} \nu_i = n$) corresponding to a particular ‘grouping’ of the data, and the $p_i$ are
the probabilities of hitting these intervals calculated for the theoretical distribution
under test.

In accordance with Pearson’s criterion, the conjecture

$\mathcal H_0$: the empirical data agree with the theoretical model

is rejected with significance level $\alpha$ if $\hat\chi^2 > \chi^2_{k-1,\,1-\alpha}$, where $\chi^2_{k-1,\,1-\alpha}$ is the
$Q_{1-\alpha}$-quantile (that is, the quantile of order $1 - \alpha$) of the $\chi^2$-distribution with $k - 1$
degrees of freedom. Recall that the $\chi^2$-distribution with $n$ degrees of freedom is
the distribution of the random variable

$$\chi^2_n = \xi_1^2 + \dots + \xi_n^2,$$

where $\xi_1, \dots, \xi_n$ are independent standard normally distributed ($\mathcal N(0,1)$) random
variables. The density $f_n(x)$ of this distribution is

$$f_n(x) = \begin{cases} \dfrac{x^{n/2-1} e^{-x/2}}{2^{n/2}\, \Gamma(n/2)}, & x > 0, \\[1ex] 0, & x \le 0. \end{cases} \tag{3}$$


In [127] one can find the results of statistical testing for conjecture $\mathcal H_0$ with
significance level $\alpha = 1\%$ in the case of the stock of ten major German companies
and banks (BASF, BMW, Daimler Benz, Deutsche Bank, Dresdner Bank,
Hoechst, Preussag, Siemens, Thyssen, VW). This involves the data over a period
of three years (October 2, 1989 to September 30, 1992). The calculation of $\hat\chi^2$ and
$\chi^2_{k-1,\,1-\alpha}$ (with $k = 22$, on the basis of $n = 745$ observations) shows that conjecture
$\mathcal H_0$ must be rejected for all these ten companies. For instance, the values of $\hat\chi^2$
(with $p_i = 1/k$, $k = 22$) for BASF and Deutsche Bank are equal to 104.02 and
88.02, respectively, while the critical value of $\chi^2_{k-1,\,1-\alpha}$ for $k = 22$ and $\alpha = 0.01$ is
38.93. Hence $\hat\chi^2$ is considerably larger than $\chi^2_{k-1,\,1-\alpha}$.
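The computation behind such a test can be sketched as follows. The Python code below is
an illustration under assumptions, not the procedure of [127]: the cells are taken
equiprobable ($p_i = 1/k$) under a normal law fitted by the sample mean and standard
deviation, the critical value 38.93 is the one quoted above for $k = 22$ and
$\alpha = 0.01$, and all function names are hypothetical.

```python
from bisect import bisect_right
from statistics import NormalDist, fmean, pstdev

def pearson_statistic(counts, probs):
    """Sum over the cells of (nu_i - n p_i)^2 / (n p_i)."""
    n = sum(counts)
    return sum((v - n * p) ** 2 / (n * p) for v, p in zip(counts, probs))

def equiprobable_counts(sample, k):
    """Counts nu_i in k cells of probability 1/k each under the fitted normal law."""
    fitted = NormalDist(fmean(sample), pstdev(sample))
    edges = [fitted.inv_cdf(i / k) for i in range(1, k)]   # k - 1 interior cell boundaries
    counts = [0] * k
    for x in sample:
        counts[bisect_right(edges, x)] += 1
    return counts

def rejects_normality(sample, k=22, critical_value=38.93):
    """True if the statistic exceeds the critical value quoted in the text."""
    counts = equiprobable_counts(sample, k)
    return pearson_statistic(counts, [1.0 / k] * k) > critical_value
```

For another number of cells or another significance level the critical value must be
replaced by the corresponding quantile of the $\chi^2$-distribution.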
§ 2c. One-Dimensional Distributions of the Logarithms
of Relative Price Changes. ‘Heavy Tails’ and Their Statistics 


1. In view of deviations from the normal property and the ‘heavy tails’ of empirical
densities the researchers have come to a common opinion that, for instance, for the
‘right-hand tails’ (that is, as $x \to +\infty$) we must have

$$P(\tilde h^{(\Delta)}_{t_k} > x) \sim x^{-\alpha} L(x), \tag{1}$$

where the ‘tail index’ $\alpha = \alpha(\Delta)$ is positive and $L = L(x)$ is a slowly varying function
($\frac{L(xy)}{L(x)} \to 1$ as $x \to \infty$ for each $y > 0$). A similar conclusion holds for ‘left-hand
tails’.

We note that discussions of ‘heavy tails’ and kurtosis in the finance literature
can be found in [46], [361], [419], and some other papers, and even in several
publications dating back to the 1960s (see, e.g., [150], [317]).

There, it was pointed out that the kurtosis and the ‘heavy tails’ of a distribution
density could occur, for instance, in the case of mixtures of normal distributions.
(On this subject, see Chapter III, § 1d, where we explain how one can obtain, for
example, hyperbolic distributions by ‘mixing’ normal distributions with different
variances.)

In several papers (see, e.g., [46] and [390]), in connection with the search for
suitable distributions for the $\tilde h^{(\Delta)}_{t_k}$, the authors propose to use Student’s $t$-distribution
with density

$$f_n(x) = \frac{1}{\sqrt{\pi n}}\, \frac{\Gamma\left(\frac{n+1}{2}\right)}{\Gamma\left(\frac{n}{2}\right)} \left(1 + \frac{x^2}{n}\right)^{-\frac{n+1}{2}}, \tag{2}$$


where n is an integer parameter, the ‘number of the degrees of freedom’. Clearly, 
this is a distribution of Pareto type with ‘heavy tails’. 


2. After B. Mandelbrot (see, for instance, [318]-[324]) and E. Fama ([150]), it became
a fashion in the finance literature to consider models of financial indexes
based on stable distributions (see Chapter III, § 1a for details). Such a distribution
has a stability exponent $\alpha$ in the interval $(0, 2]$. If $\alpha = 2$, then the stable
distribution is normal; for $0 < \alpha < 2$ this is a Pareto-type distribution satisfying (1),
and the ‘tail index’ $\alpha$ is just the stability exponent.

This explains why the conjecture of a stable distribution with $0 < \alpha < 2$ arises
naturally in our search for the distributions of the $h_k = \tilde h^{(\Delta)}_{t_k}$; such a distribution
is marked by both ‘heavy tails’ and strong kurtosis, which are noticeable in the
statistical data. Another argument in favor of these distributions is the characteristic
property of self-similarity (see Chapter III, § 2b): if $X$ and $Y$ are independent random
variables having a stable distribution with stability exponent $\alpha$, then their sum
also has a stable distribution with the same exponent (equivalently, the convolution
of the distributions of these variables is a distribution of the same type).

From the economic standpoint, this is a perfectly natural property of the
preservation of the type of distribution under time aggregation. The fact that stable
distributions have this property is additional evidence in favor of their use.

However, operating with stable distributions involves considerable difficulties, brought
on by the following factors.

If $X$ is a random variable with stable distribution of index $\alpha$, $0 < \alpha < 2$, then
$E|X| < \infty$ only for $\alpha > 1$. More generally, $E|X|^p < \infty$ if and only if $p < \alpha$.

Hence the tails of a stable distribution with $0 < \alpha < 2$ are so ‘heavy’ that
the second moment is infinite. This leads to considerable theoretical complications
(for instance, in the analysis of the quality of various estimators or tests based on
variance); on the other hand this is difficult to explain from the economic standpoint
or to substantiate by facts since one usually has only a limited stock of suitable
statistical data.

We must point out in this connection that the estimation of the actual value of
the ‘tail index’ $\alpha$ is, in general, a fairly delicate task.

On the one hand, to get a good estimate of $\alpha$ one must have access to the results
of sufficiently many observations, in order to collect a stock of ‘extremal’ values,
which alone are suitable for the assessment of ‘tail effects’ and the ‘tail index’.
Unfortunately, on the other hand, the large number of ‘nonextremal’ observations
inevitably contributes to biased estimates of the actual value of $\alpha$.


3. As seen from the properties of stable distributions, using them for the description
of the distributions of financial indexes we cannot possibly satisfy the following three
conditions simultaneously: stability of the type of distribution under convolution,
heavy tails with index $0 < \alpha < 2$, and the finiteness of the second moment (and,
therefore, of the variance).

Clearly, Pareto-type distributions with ‘tail index’ $\alpha > 2$ have finite variance.
Although they are not preserved by convolutions, they nevertheless have the important
property of the stability of the order of decrease of the distribution density
under convolution.

More precisely, this means the following. If $X$ and $Y$ have the same Pareto-type
distribution with ‘tail index’ $\alpha$ and are independent, then their sum $X + Y$ also
has a Pareto-type distribution with the same ‘tail index’ $\alpha$. In this sense, we can
say that Pareto-type distributions satisfy the required property of the stability of
the ‘tail index’ $\alpha$ under convolutions.

It is already clear from the above why we pay so much attention to this index $\alpha$,
which determines the behavior of the distributions of the variables $\tilde h^{(\Delta)}_{t_k}$ at infinity.
We can give also an ‘econo-financial’ explanation to this interest towards $\alpha$. The
‘tail index’ indicates, in particular, the role of ‘speculators’ on the market. If $\alpha$
is large, then extraordinary oscillations of prices are rare and the market is
‘well-behaved’. Accepting this interpretation, a market characterized by a large value
of $\alpha$ can be considered as ‘efficient’, so that the value of $\alpha$ is a measure of this
‘efficiency’. (For a discussion devoted to this subject see, for example, [204].)


4. We now consider the estimates for the ‘stability index’ of stable distributions 
and, in general, for the ‘tail indices’ of Pareto-type distributions. 

It should be noted from the outset that authors are not unanimous on the issue
of the ‘real’ value of the ‘tail index’ $\alpha$ for particular exchange rates, stock prices,
or other indexes. The reason, as already mentioned, lies in the complexity of the
task of the construction of efficient estimators $\hat\alpha_N$ for $\alpha$ (here $N$ is the number of
observations). Stating the problem of estimation for this parameter requires in its
own right an accurate specification of all details of preparing the statistical ‘stock’
$\tilde h^{(\Delta)}_{t_k}$, a careful choice of the value of $\Delta$, and so on.


In the finance literature, one often uses the following ‘efficient’ estimators $\hat\alpha_N$
for $\alpha$, which were suggested in [152] and [153]:

$$\hat\alpha_N = 0.827\, \frac{\hat Q_f - \hat Q_{1-f}}{\hat Q_{0.72} - \hat Q_{0.28}}, \qquad 0.95 \le f \le 0.97, \tag{3}$$

where $\hat Q_f$ is the quantile of order $f$ constructed for a sample of size $N$ under the
assumption that the observable variables have a symmetric stable distribution.


If it were true that the distribution $\mathrm{Law}(h_k)$ with $h_k = \tilde h^{(\Delta)}_{t_k}$ belongs to the
class of stable distributions (with stability exponent $\alpha$), then one could anticipate
that the $\hat\alpha_N$ stabilize (and converge to some value $\alpha < 2$) with the increase of the
sample size $N$.

However, there is no unanimity on this point, as already mentioned. Some
authors say that their estimators for certain financial indexes duly stabilize (see,
for instance, [88] and [474]). On the other hand, one can find in many papers
results of statistical analysis showing that the $\hat\alpha_N$ do not merely show a tendency
to grow, but approach values that are greater than or equal to 2 (see, e.g., [27] and [207]
as regards shares in the American market, or [127] as regards the stock of major
German companies and banks). All this calls for a cautious approach towards
the ‘stability’ conjecture, though, of course, there is no contradiction with the
conjecture that ‘tails’ can be described in terms of Pareto-type distributions.


5. Now, following [204], we present certain results concerning the values of the ‘tail
index’ $\alpha$ for the currency cross rates under the assumption that the $h_k = \tilde h^{(\Delta)}_{t_k}$ have
a Pareto-type distribution satisfying (1).

The following values of the ‘tail index’ $\alpha = \alpha(\Delta)$ were obtained in [204] (see the
table below).

We now make several comments as regards this table.

In subsection 6 below we describe estimates of $\alpha$ based on the data of Olsen &
Associates (cf. § 1b). For $\Delta$ equal to 6 hours the estimation accuracy is low, due to
the insufficient number of observations.
Rate \ $\Delta$ | 10 m | 30 m | 1 h | 6 h
DEM/USD | 3.11 ± 0.33 | 3.35 ± 0.29 | 3.50 ± 0.57 | 4.48 ± 1.64
JPY/USD | 3.53 ± 0.21 | 3.55 ± 0.47 | 3.62 ± 0.46 | 3.86 ± 1.81
GBP/USD | 3.44 ± 0.22 | 3.52 ± 0.46 | 4.01 ± 1.09 | 6.93 ± 10.79
CHF/USD | 3.64 ± 0.41 | 3.74 ± 0.82 | 3.84 ± 0.77 | 4.39 ± 4.64
FRF/USD | 3.34 ± 0.22 | 3.29 ± 0.47 | 3.40 ± 0.69 | 4.61 ± 1.21
FRF/DEM | 3.11 ± 0.41 | 2.55 ± 0.23 | 2.43 ± 0.23 | 3.54 ± 1.42
NLG/DEM | 3.05 ± 0.27 | 2.44 ± 0.08 | 2.19 ± 0.12 | 3.37 ± 1.43
ITL/DEM | 3.31 ± 0.51 | 2.93 ± 1.17 | 2.54 ± 0.49 | 2.86 ± 0.98
GBP/DEM | 3.68 ± 0.35 | 3.63 ± 0.42 | 4.18 ± 1.67 | 3.22 ± 0.79
JPY/DEM | 3.69 ± 0.41 | 4.18 ± 0.90 | 4.13 ± 1.05 | 4.71 ± 1.61


Analyzing the figures in the table (which rest upon a large database and must
be reliable for this reason), we can arrive at an important conclusion that the
exchange rates of the basic currencies against the US dollar in the FX-market have
(for $\Delta = 10$ minutes) a Pareto-type distribution with ‘tail index’ $\alpha \approx 3.5$, which
increases with the increase of $\Delta$. Hence it is now very likely that the variance of
$h_k = \tilde h^{(\Delta)}_{t_k}$ must be finite (a desirable property!), although we cannot say the same
about the fourth moment responsible for the leptokurtosis of distributions.

In another paper [91] by Olsen & Associates one can also find data on the rates
XAU/USD and XAG/USD (XAU is gold and XAG is silver). For $\Delta = 10$ minutes
the corresponding estimates for $\alpha$ are 4.32 ± 0.56 and 4.04 ± 1.71, respectively; for
$\Delta = 30$ minutes these estimates are 3.88 ± 1.04 and 3.92 ± 0.73.


6. In this subsection we merely outline the construction of estimators $\hat\alpha$ for the
‘tail index’ $\alpha$ used for the above table borrowed from [204] (we do not dwell on the
‘bootstrap’ and ‘jackknife’ methods significant for the evaluation of the bias and
the standard deviation of these estimators).

We consider the Pareto distribution with density

$$f_{\alpha b}(x) = \frac{\alpha b^{\alpha}}{x^{\alpha+1}}, \qquad x \ge b, \tag{4}$$

where $f_{\alpha b}(x) = 0$ for $x < b$.

For $x \ge b$ we have

$$\ln f_{\alpha b}(x) = \ln \alpha + \alpha \ln b - (\alpha + 1) \ln x. \tag{5}$$
Hence the maximum likelihood estimator $\hat\alpha_N$ constructed on the basis of $N$
independent observations $(X_1, \dots, X_N)$ so as to satisfy the condition

$$\max_{\alpha} \prod_{k=1}^{N} f_{\alpha b}(X_k) = \prod_{k=1}^{N} f_{\hat\alpha_N b}(X_k), \tag{6}$$

can be defined by the formula

$$\hat\alpha_N = \frac{N}{\sum_{i=1}^{N} \ln \frac{X_i}{b}}. \tag{7}$$


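Since

$$\alpha \int_{\beta}^{\infty} (\ln x)\, x^{-(\alpha+1)}\, dx = \beta^{-\alpha} \left( \frac{1}{\alpha} + \ln \beta \right)$$

for $\alpha > 0$ and $\beta > 0$, it follows that

$$E \ln \frac{X_i}{b} = \alpha b^{\alpha} \int_{b}^{\infty} \left( \ln \frac{x}{b} \right) x^{-(\alpha+1)}\, dx = \frac{1}{\alpha}.$$

Hence

$$E\, \frac{1}{\hat\alpha_N} = \frac{1}{\alpha},$$

that is, $1/\hat\alpha_N$ is an unbiased estimator for $1/\alpha$, so that the estimator $\hat\alpha_N$ for $\alpha$
has fairly good properties (of course, provided that the actual distribution is
exactly a Pareto distribution with known ‘starting point’ $b$, rather than a
Pareto-type distribution that has no well-defined ‘starting point’).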
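The behavior of the maximum likelihood estimator with a known ‘starting point’ can be
checked on simulated data. The following Python sketch is an illustration only: the names
`pareto_mle` and `sample_pareto`, the sample size and the parameter values are assumptions,
and the sample is generated by the inverse-transform method from the exact Pareto law.

```python
import math
import random

def pareto_mle(xs, b):
    """Maximum likelihood estimate of alpha: N divided by the sum of ln(X_i / b)."""
    return len(xs) / sum(math.log(x / b) for x in xs)

def sample_pareto(n, alpha, b, seed=0):
    """b * U**(-1/alpha), with U uniform on (0, 1], has tail P(X > x) = (b / x)**alpha."""
    rng = random.Random(seed)
    return [b * (1.0 - rng.random()) ** (-1.0 / alpha) for _ in range(n)]

xs = sample_pareto(n=20000, alpha=3.5, b=1.0)   # assumed parameter values
alpha_hat = pareto_mle(xs, b=1.0)               # expected to be close to 3.5
```

With a sample of this size the estimate differs from the true value only in the second
decimal place or so; the difficulties discussed in the text arise when $b$ is unknown and
the law is of Pareto type only asymptotically.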

There arises a natural idea (see [223]) to use (7) nevertheless for the estimate
of $\alpha$ in Pareto-type distributions, replacing the unknown ‘starting point’ $b$ by some
suitable estimate of it.

For instance, we can proceed as follows. We choose a sufficiently large number
$M$ (although it should not be too large in comparison with $N$) and construct
an estimator for $\alpha$ using a modification of (7) with $b$ replaced by $M$ and with the
sum taken over $i \le N$ such that $X_i \ge M$.

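To this end we set

$$\hat\gamma_{N,M} = \frac{\sum_{\{i \le N:\ X_i \ge M\}} \ln \frac{X_i}{M}}{\sum_{\{i \le N:\ X_i \ge M\}} I_M(X_i)}, \tag{8}$$

where

$$I_M(x) = \begin{cases} 1 & \text{if } x \ge M, \\ 0 & \text{if } x < M. \end{cases}$$

Setting

$$\nu_{N,M} = \sum_{\{i \le N:\ X_i \ge M\}} I_M(X_i),$$

we can rewrite the formula for $\hat\gamma_{N,M}$ as follows:

$$\hat\gamma_{N,M} = \frac{1}{\nu_{N,M}} \sum_{\{i \le N:\ X_i \ge M\}} \ln \frac{X_i}{M}, \tag{9}$$

and we can consider an estimator $\hat\alpha_{N,M}$ for $\alpha$ such that

$$\frac{1}{\hat\alpha_{N,M}} = \hat\gamma_{N,M}. \tag{10}$$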
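We could also proceed otherwise. Namely, we order the sample $(X_1, X_2, \dots, X_N)$
to obtain another sample, $(X_1^*, X_2^*, \dots, X_N^*)$, with $X_1^* \ge X_2^* \ge \dots \ge X_N^*$. We fix
again some integer $M < N$, take $X_M^*$ as the ‘starting point’ $b$, and set

$$\hat\gamma^*_{N,M} = \frac{1}{M} \sum_{1 \le i \le M} \ln \frac{X_i^*}{X_M^*}. \tag{11}$$

Then we can consider the estimator $\hat\alpha^*_{N,M}$ for $\alpha$ such that

$$\frac{1}{\hat\alpha^*_{N,M}} = \hat\gamma^*_{N,M}. \tag{12}$$

The estimator $\hat\alpha^*_{N,M}$ so obtained was first proposed by B. M. Hill [223], and it is
usually called the Hill estimator.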
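A direct implementation of the Hill estimator takes only a few lines. The Python sketch
below follows the description above with the $M$-th largest observation used as the
‘starting point’; this reading of the construction, the function name, and the simulated
sample are assumptions made for the illustration.

```python
import math
import random

def hill_estimator(xs, m):
    """Hill estimate of the tail index from the m largest observations (m >= 2)."""
    top = sorted(xs, reverse=True)[:m]                       # X*_1 >= ... >= X*_M
    gamma = sum(math.log(x / top[-1]) for x in top) / m      # mean log-excess over X*_M
    return 1.0 / gamma

# Simulated exact Pareto sample with alpha = 3 and b = 1 (assumed values).
rng = random.Random(0)
sample = [(1.0 - rng.random()) ** (-1.0 / 3.0) for _ in range(20000)]
alpha_hat = hill_estimator(sample, m=2000)
```

Repeating the computation for several values of `m` gives an idea of how sensitive the
estimate is to the number of order statistics used, which is the issue discussed next.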

Obviously, the ‘good’ properties of this estimator depend on the right choice of $M$,
the number of maximal order statistics generating the statistics $\hat\alpha^*_{N,M}$. However, it
is also clear that one can hardly expect to make a ‘universal’ choice of $M$, suitable
for a wide class of slowly varying functions $L = L(x)$ describing the behavior of,
say, the right-hand ‘tail’ in accordance with the formula

$$P(X_i > x) \sim x^{-\alpha} L(x). \tag{13}$$

Usually one studies the properties of the above estimators $\hat\alpha^*_{N,M}$ and $\hat\alpha_{N,M}$ for
a particular subclass of functions $L = L(x)$. For instance, we can assume that
$L = L(x)$ belongs to the subclass

$$L_\gamma = \{ L = L(x)\colon L(x) = 1 + c x^{-\gamma} + o(x^{-\gamma}),\ c > 0 \}$$

with $\gamma > 0$. It is shown in [223] under this assumption that if $M \to \infty$ as $N \to \infty$
so that

$$\frac{M}{N^{\frac{2\gamma}{2\gamma + \alpha}}} \to 0,$$

then

$$\mathrm{Law}\bigl( \sqrt{M}\, (\hat\alpha^*_{N,M} - \alpha) \bigr) \to \mathcal N(0, \alpha^2),$$

i.e., the estimators $(\hat\alpha^*_{N,M})$ are asymptotically normal.


340 Chapter IV. Statistical Analysis of Financial Data 


§ 2d. One-Dimensional Distributions of the Logarithms 
of Relative Price Changes. 
Structure of the ‘Central’ Parts of Distributions 


1. And yet, how can one, bearing in mind the properties of the distribution
$\mathrm{Law}(h_k)$, combine large kurtosis and ‘tail index’ $\alpha > 2$ (as in the case of currency
exchange rates)?

Apparently, one can hardly expect to carry this out by means of a single
‘standard’ distribution. Taking into account the fact that the market is full of traders
and investors with various interests and time horizons, a more attractive option
is the one of invoking several ‘standard’ distributions, each valid in a particular
domain of the range of the variables $h_k$.

Many authors and, first of all, Mandelbrot are insistent in advocating the
use of stable distributions (and some of their modifications) as extremely suitable for
the ‘central’ part of this range. (See, for instance, the monograph [352] containing
plenty of statistical data, the theory of stable distributions and their generalizations,
and results of statistical analysis.)

In what follows we shall discuss the results of [330] relating to the use of stable 
laws in the description of the S&P500 Index in the corresponding ‘central’ zone. 
(To describe the ‘tails’ the authors of [330] propose to use the normal distribution; 
the underlying idea is that the shortage of suitable statistical data rules out reliable 
conclusions about the behavior of the ‘tails’. See also [464].) 

As regards other financial indexes we refer to [127], where one can find a minute 
statistical analysis of the financial characteristics of ten major German corporations 
and the conclusion that the hyperbolic distribution is extremely well suited for the 
‘central’ zone. 

In Chapter III, § 1d we gave a detailed description of the class of hyperbolic
distributions, which, together with the class of stable distributions, provides one
with a rather rich armory of theoretical distributions. As both hyperbolic and stable
distributions can be described by four parameters, the hopes of reaching a good
agreement between ‘theory’ and ‘experiment’ using their combinations seem well
based.


2. We consider now the results of the statistical analysis of the data relating to the 
S&P500 Index that was carried out in [330]. 

The authors considered the six-year evolution of this index on NYSE (New York
Stock Exchange) (January 1984 through December 1989). All in all, 1 447 514 ticks
were registered (the data of the Chicago Mercantile Exchange). On the average,
ticks occurred with one-minute intervals during 1984-85 and with 15-second
intervals during 1986-87.

Since the exchange operates only during opening hours, the construction of the
process describing the evolution of the S&P500 Index took into account only the
‘trading time’ $t$, and the closing prices of each day were adjusted to match the
‘opening prices’ of the next day.
Let $S = (S_t)$ be the resulting process. We shall now consider the changes of this
process over some intervals of time $\Delta$:

$$\Delta S_{t_k} = S_{t_k} - S_{t_{k-1}}, \tag{1}$$

where $t_k = k\Delta$. The interval $\Delta$ will vary from 1 minute to $10^3$ minutes. (In [330],
$\Delta$ assumes the following values: 1, 3, 10, 32, 100, 316, and 1000 minutes; the number
of ticks corresponding to $\Delta = 1$ m is 493 545 and it is 562 for $\Delta = 1000$ m.)

Using the notation of § 2a, one can observe that

$$S_{t_k} = S_{t_{k-1}}\, e^{\tilde h^{(\Delta)}_{t_k}} \approx S_{t_{k-1}} \bigl( 1 + \tilde h^{(\Delta)}_{t_k} \bigr)$$

and therefore $\Delta S_{t_k} \approx S_{t_{k-1}} \tilde h^{(\Delta)}_{t_k}$, since $S_{t_k} \approx S_{t_{k-1}}$ and the increments
$\Delta S_{t_k} = S_{t_k} - S_{t_{k-1}}$ are small.

This (approximate) equality shows that for independent increments $(S_{t_k} - S_{t_{k-1}})$
the distribution $P(\tilde h^{(\Delta)}_{t_k} \le x)$ and the conditional distribution $P(\Delta S_{t_k} \le x \mid S_{t_{k-1}} = y)$
have roughly the same behavior.

Considering the empirical densities $\hat p^{(\Delta)}(x)$ of the variables $\Delta S_{t_k}$, $t_k = k\Delta$,
which are assumed to be identically distributed, the authors of [330] plot the graphs
of the $\log_{10} \hat p^{(\Delta)}(x)$. Schematically, they are as in Fig. 35.

FIGURE 35. Sketches of the graphs of $\log_{10} \hat p^{(\Delta)}(x)$ for two distinct values of $\Delta$


A mere visual inspection of the many graphs of $\log_{10} \hat p^{(\Delta)}(x)$ for various values
of $\Delta$ shows that the distribution densities are fairly symmetric and are ‘melting’
as $\Delta$ grows. They decrease as $x \to \pm\infty$, but not as fast as they should for a
Gaussian distribution.

The unimodality and the symmetry that are visible here, together with the
character of the decrease at infinity of the empirical densities, indicate that it might be
reasonable to direct attention to stable symmetric distributions. We recall that the
characteristic function $\varphi(\theta) = E e^{i\theta X}$ of a stable random variable $X$ with symmetric
distribution is as follows (see formula (14) in Chapter III, § 1a):

$$\varphi(\theta) = e^{-\sigma^{\alpha} |\theta|^{\alpha}}, \tag{2}$$

where $\sigma > 0$ and $0 < \alpha \le 2$. Hence, once one has accepted the ‘stability’ conjecture,
one should first of all find an estimate for the parameter $\alpha$.

Stable distributions have the Pareto type. In the symmetric case (see (7) and (8)
in Chapter III, § 1a), if $0 < \alpha < 2$, then we have

$$P(|X| > x) \sim \tilde c_\alpha x^{-\alpha} \quad \text{as } x \to \infty$$

with some constant $\tilde c_\alpha$, and we could find the value of $\alpha$ using the techniques of the
construction of estimators described in § 2c.

However, as rightly pointed out in [330], this method for the estimation of a is 
not very reliable due to the insufficient amount of observations; it requires many 
‘extremal’ values. For that reason, the approach chosen in [330] is a different one: 
it takes into account, by contrast, only the results of observations fitting into the 
‘central’ zone. The main idea of this approach is as follows. 

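Assume that the characteristic function

$$\varphi^{(\Delta)}(\theta) = \mathsf{E}e^{i\theta \Delta S_{t_k}}$$

has the following form:

$$\varphi^{(\Delta)}(\theta) = e^{-\gamma\Delta|\theta|^\alpha}. \tag{3}$$

Then the density $p^{(\Delta)}(x)$ of the distribution $P(\Delta S_{t_k} \le x)$ can be represented, by the inversion formula, as the integral

$$p^{(\Delta)}(x) = \frac{1}{\pi}\int_0^\infty e^{-\gamma\Delta\theta^\alpha}\cos\theta x\,d\theta.$$

For $x = 0$ we obtain

$$p^{(\Delta)}(0) = \frac{1}{\pi}\int_0^\infty e^{-\gamma\Delta\theta^\alpha}\,d\theta = \frac{\Gamma(1/\alpha)}{\pi\alpha(\gamma\Delta)^{1/\alpha}}. \tag{4}$$

Hence

$$p^{(n\Delta)}(0) = n^{-1/\alpha}p^{(\Delta)}(0). \tag{5}$$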
Of course, we could obtain the same without resorting to the representation (3), by appealing directly to the definition of a stable law, which requires that

$$\mathrm{Law}(\Delta S_{t_1} + \Delta S_{t_2} + \cdots + \Delta S_{t_n}) = \mathrm{Law}(C_n(S_{t_1} - S_{t_0})), \qquad t_0 = 0, \tag{6}$$

and by using the equality

$$C_n = n^{1/\alpha}$$

(see Chapter III, § 1a).
Indeed, since $\Delta S_{t_1} + \Delta S_{t_2} + \cdots + \Delta S_{t_n} = S_{t_n} - S_0$, it follows that

$$\mathrm{Law}(S_{t_n} - S_0) = \mathrm{Law}(n^{1/\alpha}(S_{t_1} - S_0)).$$

Hence

$$p^{(n\Delta)}(x) = n^{-1/\alpha}p^{(\Delta)}(x n^{-1/\alpha}), \tag{7}$$

and setting $x = 0$ we obtain (5).

Relation (5) enables one to find an estimate $\hat\alpha$ of the ‘stability index’ $\alpha$ by considering the empirical densities $\hat p^{(n\Delta)}(0)$ with $\Delta$ equal to 1 minute and $n = 1, 3, 10, 32, 100, 316, 1000$, then passing to the logarithms, and using the least squares method. (This choice of $n = 1, 3, 10, \ldots$ is explained by the fact that the corresponding values of $\log_{10} n$ are approximately equidistant: $\log_{10} 3 = 0.477$, $\log_{10} 10 = 1$, $\log_{10} 32 = 1.505$, ....)

The value of this estimate of $\alpha$ obtained in [330] is

$$\hat\alpha = 1.40 \pm 0.05. \tag{8}$$
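The least-squares step can be sketched as follows. By (5), $\log_{10} p^{(n\Delta)}(0)$ is linear in $\log_{10} n$ with slope $-1/\alpha$, so fitting a straight line and inverting the slope gives an estimate of $\alpha$. This is only an illustration: the function name `stability_index_from_p0` and the input values are hypothetical (they are generated to satisfy (5) exactly and are not the data of [330]).

```python
import math

def stability_index_from_p0(ns, p0s):
    """Fit log10 p(0) against log10 n by least squares; by (5) the slope is -1/alpha."""
    xs = [math.log10(n) for n in ns]
    ys = [math.log10(p) for p in p0s]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    num = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    den = sum((x - x_mean) ** 2 for x in xs)
    slope = num / den
    return -1.0 / slope

# Synthetic densities at zero obeying (5) with alpha = 1.4 (illustrative values only).
ns = [1, 3, 10, 32, 100, 316, 1000]
p0_unit = 15.0  # hypothetical value of p(0) for n = 1
p0s = [p0_unit * n ** (-1.0 / 1.4) for n in ns]

alpha_hat = stability_index_from_p0(ns, p0s)
```

With real empirical densities the points would scatter around the line, and the fit would also supply an error bar of the kind quoted in (8).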


We point out at the outset that there is no contradiction whatsoever between this result and the estimate $\hat\alpha \approx 3.5$ of the ‘tail index’ $\alpha$ in § 2c.5. The point is that these estimates are obtained under different assumptions about the character of distributions. In one case we assume that the distribution has ‘stable’ type, while in the other we assume that it is of Pareto type. Moreover (and this can be important), the object of the first research is the currency cross rates and the object of the other is the S&P500 Index. Generally speaking, there are no solid grounds to assume that the behavior of their distributions must be similar, for the defining factors in these two cases are distinct (the state of the world economy in the case of exchange rates and the state of the American national economy in the case of the S&P500 Index).

This idea of the different behavior of the exchange rates and financial indexes 
of S&P500 or DJIA kind is also substantiated by the results of the R/S-analysis 
described in § 4b below. 

We note also that the estimate (8) is based on the ‘central’ values, while the estimate $\hat\alpha \approx 3.5$ was found on the basis of the ‘marginal’ values. Hence this discrepancy is just another argument in favor of the above-mentioned thesis that the financial indexes must be described by different ‘standard’ distributions in different domains of their values, and that a single ‘universal’ distribution must be hard to find.
3. The conclusion that the empirical densities $\hat p^{(\Delta)}(x)$ can be well approximated by stable symmetric densities $p^{(\Delta)}(x)$ in the ‘central’ domain can be substantiated also by the following arguments, based on the self-similarity property.

We consider a sample

$$(\Delta S_{t_1}, \ldots, \Delta S_{t_k}), \qquad t_i - t_{i-1} = n\Delta,$$

of size $k$, with time step $n\Delta$, where $\Delta$ is 1 minute. If we proceed now to the sample

$$(n^{-1/\alpha}\Delta S_{t_1}, \ldots, n^{-1/\alpha}\Delta S_{t_k}), \qquad t_i - t_{i-1} = n\Delta,$$

then this new sample must have the same distribution as

$$(\Delta S_{t_1}, \ldots, \Delta S_{t_k}), \qquad t_i - t_{i-1} = \Delta.$$

Hence, in accordance with (7), the corresponding estimates of the one-dimensional densities of the (identically distributed) variables $n^{-1/\alpha}\Delta S_{t_i}$ (with time step $n\Delta$) and $\Delta S_{t_i}$ (with time step $\Delta$) must be ‘very similar’.

A chart in [330] obtained by a ‘superposition’ (see Chapter III, § 2c.6 for information about this method) of the so transformed empirical densities for the values of $\Delta$ equal to 1, 3, 10, 32, 100, 316, and 1000 minutes strongly supports the stable distribution conjecture (with $\alpha = 1.40$).

One can find an estimate of the coefficient $\gamma$ in (3) on the basis of the empirical density $\hat p^{(\Delta)}(0)$ and the estimate $\hat\alpha = 1.40$, by using (4). The corresponding value is $\hat\gamma = 0.00375$.


3. Statistics of Volatility, Correlation Dependence, 
and Aftereffect in Prices 


§ 3a. Volatility. Definition and Examples 


1. Arguably, no concept in financial mathematics is as loosely interpreted and as widely discussed as ‘volatility’. A synonym to ‘changeability’,¹ ‘volatility’ has many definitions, and is used to denote various measures of changeability.

If $S_n = S_0 e^{H_n}$ with $H_0 = 0$ and $\Delta H_n = \sigma\varepsilon_n$, $n \ge 1$, where $(\varepsilon_n)$ is Gaussian white noise ($\varepsilon_n \sim \mathcal{N}(0,1)$), then one means by volatility the natural measure of uncertainty and changeability, the standard deviation $\sigma$.


We recall that if $\xi \sim \mathcal{N}(\mu, \sigma^2)$ for a random variable $\xi$, then

$$P(|\xi - \mu| \le \sigma) \approx 0.68 \tag{1}$$

and

$$P(|\xi - \mu| \le 1.65\sigma) \approx 0.90. \tag{2}$$

Hence one can expect in approximately 90% of cases that the result of an observation of $\xi$ deviates from the mean value $\mu$ by $1.65\sigma$ at most.
In employing the scheme ‘$S_n = S_{n-1}e^{h_n}$’ one usually deals with small values of the $h_n$, so that

$$S_n \approx S_{n-1}(1 + h_n).$$


Hence, if $h_n = \sigma\varepsilon_n$, then, given the ‘today’ value of some price $S_{n-1}$, we can say that its ‘tomorrow’ value $S_n$ will in 90% of cases lie in the interval

$$[S_{n-1}(1 - 1.65\sigma),\ S_{n-1}(1 + 1.65\sigma)],$$

so that it is only in 5% of cases that $S_n$ is larger than $S_{n-1}(1 + 1.65\sigma)$ and it is smaller than $S_{n-1}(1 - 1.65\sigma)$ in 5% of cases.

¹ “Random House Webster’s concise dictionary” (Random House, New York, 1993) gives the following explanation of the adjective volatile: “1. evaporating rapidly. 2. tending or threatening to erupt in violence: explosive. 3. changeable; unstable.”


Remark 1. This explains why, in some handbooks of finance (see, for instance, [404]), one measures volatility in terms of $v = 1.65\sigma$ in place of the standard deviation $\sigma$.
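The constants 0.68 and 0.90 in (1) and (2) can be checked numerically. The sketch below uses the identity $P(|\xi - \mu| \le c\sigma) = \operatorname{erf}(c/\sqrt{2})$ for a Gaussian variable; the helper names and the sample price and volatility are illustrative assumptions, not values from the text.

```python
import math

def prob_within(c):
    """P(|xi - mu| <= c * sigma) for a Gaussian variable xi ~ N(mu, sigma^2)."""
    return math.erf(c / math.sqrt(2.0))

def next_price_band(s_prev, sigma, c=1.65):
    """Interval [S_{n-1}(1 - c*sigma), S_{n-1}(1 + c*sigma)] for the next price."""
    return s_prev * (1.0 - c * sigma), s_prev * (1.0 + c * sigma)

p_one_sigma = prob_within(1.0)    # close to 0.68, as in (1)
p_165_sigma = prob_within(1.65)   # close to 0.90, as in (2)

# Hypothetical example: today's price 100 and sigma = 1% per step.
low, high = next_price_band(100.0, 0.01)
```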


2. We have already seen before that the model ‘$h_n = \sigma\varepsilon_n$, $n \ge 1$’ is fairly distant from real life. It would be more realistic to use conditional Gaussian models of the kind ‘$h_n = \sigma_n\varepsilon_n$, $n \ge 1$’, with a random sequence $\sigma = (\sigma_n)_{n\ge1}$ of $\mathcal{F}_{n-1}$-measurable variables $\sigma_n$ and with $\mathcal{F}_n$-measurable $\varepsilon_n$, where $(\mathcal{F}_n)$ is the flow of ‘information’ (on the price values, say; see Chapter I, § 2a for greater detail).

There is an established tradition of calling $\sigma = (\sigma_n)_{n\ge1}$ (in the above model) the sequence of ‘volatilities’. The random nature of the latter can be reflected by saying that ‘the volatility is volatile on its own’.

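Note that

$$\mathsf{E}(h_n^2 \mid \mathcal{F}_{n-1}) = \sigma_n^2, \tag{3}$$

and the sequence $H = (H_n, \mathcal{F}_n)_{n\ge1}$ of the variables $H_n = h_1 + \cdots + h_n$, where $\mathsf{E}|h_n|^2 < \infty$ for $n \ge 1$, is a square integrable martingale with quadratic characteristic

$$\langle H\rangle_n = \sum_{k=1}^n \mathsf{E}(h_k^2 \mid \mathcal{F}_{k-1}), \qquad n \ge 1. \tag{4}$$

In view of (3),

$$\langle H\rangle_n = \sum_{k=1}^n \sigma_k^2; \tag{5}$$

therefore it is natural to call the quadratic characteristic

$$\langle H\rangle = (\langle H\rangle_n, \mathcal{F}_n)_{n\ge1}$$

the volatility of the sequence $H$.
We note that

$$\mathsf{E}H_n^2 = \mathsf{E}\langle H\rangle_n. \tag{6}$$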
3. For ARCH($p$) models we have

$$\sigma_n^2 = \alpha_0 + \sum_{i=1}^p \alpha_i h_{n-i}^2 \tag{7}$$

(see Chapter II, § 3a).
Hence the estimation problem for the volatilities $\sigma_n$ reduces to a problem of parametric estimation for $\alpha_0, \alpha_1, \ldots, \alpha_p$.
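To make recursion (7) concrete, here is a minimal simulation sketch of the conditionally Gaussian scheme $h_n = \sigma_n\varepsilon_n$ with ARCH($p$) volatilities. The function name `simulate_arch`, the zero pre-sample values $h_0 = h_{-1} = \cdots = 0$, and the parameter values are assumptions made for illustration only.

```python
import random

def simulate_arch(alpha0, alphas, n, seed=0):
    """Simulate h_n = sigma_n * eps_n with sigma_n^2 given by (7).

    Pre-sample values of h are taken to be zero (an assumption of this sketch).
    Returns the lists (h_1..h_n) and (sigma_1^2..sigma_n^2).
    """
    rng = random.Random(seed)
    p = len(alphas)
    h = [0.0] * p            # h_{1-p}, ..., h_0
    sigma2 = []
    for _ in range(n):
        # h[-i] is h_{n-i} for the current step n
        s2 = alpha0 + sum(a * h[-i] ** 2 for i, a in enumerate(alphas, start=1))
        eps = rng.gauss(0.0, 1.0)
        sigma2.append(s2)
        h.append(s2 ** 0.5 * eps)
    return h[p:], sigma2

# Hypothetical ARCH(1) parameters.
h, sigma2 = simulate_arch(alpha0=0.1, alphas=[0.3], n=50)
```

Each $\sigma_n^2$ depends only on past values of $h$, which is the $\mathcal{F}_{n-1}$-measurability required of the volatilities.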
There exist also other, e.g., nonparametric, methods for obtaining estimates of volatility. For instance, if $h_n = \mu_n + \sigma_n\varepsilon_n$ for $n \ge 1$, where $\mu = (\mu_n)$ and $\sigma = (\sigma_n)$ are stationary sequences, then a standard estimator for $\sigma_n$ is

$$\hat\sigma_n = \sqrt{\frac{1}{n-1}\sum_{k=1}^n (h_k - \bar h_n)^2}, \tag{8}$$

where $\bar h_n = \dfrac{1}{n}\sum\limits_{k=1}^n h_k$.


It is worth noting that one can also regard the empirical volatility $\hat\sigma = (\hat\sigma_n)_{n\ge1}$ as an index of financial statistics and analyze it using the same methods and tools as in the case of the prices $S = (S_n)_{n\ge1}$.

To this end we consider the variables

$$\hat r_n = \ln\frac{\hat\sigma_n}{\hat\sigma_{n-1}}, \qquad n \ge 2. \tag{9}$$

Many authors and numerous observations (see, for example, [386; Chapter 10]) demonstrate that the values of the ‘logarithmic returns’ $\hat r = (\hat r_n)_{n\ge2}$ oscillate rather swiftly, which indicates that the variables $\hat r_n$ and $\hat r_{n+1}$, $n \ge 2$, are negatively correlated. If we consider the example of the S&P500 Index and apply R/S-analysis to the corresponding values of $\hat r = (\hat r_n)_{n\ge2}$ (see Chapter III, § 2a and § 4 of the present chapter), then the results will fully confirm this phenomenon of negative correlation (see [386; Chapter 10]). Moreover, we can assume in the first approximation that the $\hat r_n$ are Gaussian variables, so that this negative correlation (coupled with the property of self-similarity standing out in observations) can be treated as an argument in favor of the thesis that this sequence is a fractional noise with Hurst parameter $H < 1/2$. (According to [386], $H \approx 0.31$ for the S&P500 Index.)


4. The well-known paper [44] (1973) of Black and Scholes made a significant contribution to the understanding of the importance of the concept of volatility. This paper contains a formula for the fair (rational) price $C_T$ of a standard call option (see Chapter I, § 1b). By this formula, the value of $C_T$ is independent of $\mu$ (surprisingly, at first glance), but depends on the value of the volatility $\sigma$ participating in the formula describing the evolution of stock prices $S = (S_t)_{t\ge0}$:

$$S_t = S_0 e^{H_t}, \qquad H_t = \sigma W_t + \Bigl(\mu - \frac{\sigma^2}{2}\Bigr)t, \tag{10}$$

where $W = (W_t)_{t\ge0}$ is a standard Wiener process.
Of course, the assumption made in this model that the volatility $\sigma$ in (10) is, first, a constant and, second, a known constant, is rather fantastic. Clearly, practical applications of the Black-Scholes formula require one to have at least a rough idea of the possible value of the volatility; this is necessary not only to find out the fair option prices, but also to assess the risks resulting from some or other decisions in models with prices described by formulas (9) and (10) in Chapter I, § 1b.

In this connection, we must discuss another (empirical) approach to the concept of ‘volatility’ that uses the Black-Scholes formula and the real option prices on securities markets.

For this definition, let $C_t = C_t(\sigma; T)$ be the (theoretical) value of the ask price at time $t < T$ of a standard European call option with $f_T = (S_T - K)^+$ and with maturity time $T$.

The price $C_t$ is theoretical. What we know in practice is the price $\hat C_t$ that was actually announced at the instant $t$, and we can ask for the root of the equation

$$\hat C_t = C_t(\sigma; T). \tag{11}$$

This value of $\sigma$, denoted by $\hat\sigma_t$, is called the ‘implied volatility’; it is considered to be a good estimate for the ‘actual’ volatility.

It should be noted that, as regards its behavior, the ‘implied volatility’ is similar to the ‘empirical volatility’ defined (in the continuous-time case) by formulae of type (8). Its negative correlation and fractal structure are rather clearly visible (see, e.g., [386; Chapter 10]).
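Solving (11) for $\sigma$ is a one-dimensional root-finding problem. The sketch below uses the standard Black-Scholes call price and plain bisection, which works because the call price is increasing in $\sigma$. The interest rate `r`, the search bracket, and all numerical inputs are assumptions of this illustration; in particular, the ‘quoted’ price is generated from the formula itself rather than taken from a market.

```python
import math

def norm_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(s, k, tau, sigma, r=0.0):
    """Black-Scholes price of a European call; tau = T - t, r = interest rate (assumed)."""
    d1 = (math.log(s / k) + (r + 0.5 * sigma ** 2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    return s * norm_cdf(d1) - k * math.exp(-r * tau) * norm_cdf(d2)

def implied_vol(price, s, k, tau, r=0.0, lo=1e-6, hi=5.0, tol=1e-10):
    """Root of bs_call(sigma) = price on [lo, hi], found by bisection."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if bs_call(s, k, tau, mid, r) < price:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

# Hypothetical quote: price produced by sigma = 0.2, then recovered from the price.
quoted = bs_call(100.0, 105.0, 0.5, 0.2)
sigma_hat = implied_vol(quoted, 100.0, 105.0, 0.5)
```

A quoted price outside the range attainable on the bracket would simply drive the iteration to one of its endpoints, so real data would call for a check of that case.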


5. We now discuss another approach to the definition of volatility, based on the consideration of the variation-related characteristics of the process $H = (H_t)_{t\ge0}$ defining the prices $S = (S_t)_{t\ge0}$ by the formula $S_t = S_0e^{H_t}$. The results of many statistical observations and economic arguments support the thesis that the processes $H = (H_t)_{t\ge0}$ have a property of self-similarity, which means, in particular, that the distributions of the variables $H_{t+\Delta} - H_t$ with distinct values of $\Delta > 0$ are similar in certain respects (see Chapter III, § 2).
We recall that if $H = B_{\mathbb{H}}$ is a fractional Brownian motion, then

$$\mathsf{E}|H_{t+\Delta} - H_t| = \sqrt{\frac{2}{\pi}}\,\Delta^{\mathbb{H}} \tag{12}$$

for all $\Delta > 0$ and $t \ge 0$ and

$$\mathsf{E}|H_{t+\Delta} - H_t|^2 = \Delta^{2\mathbb{H}}. \tag{13}$$

For a strictly $\alpha$-stable Lévy motion with $0 < \alpha \le 2$ we have

$$\mathsf{E}|H_{t+\Delta} - H_t| = \mathsf{E}|H_\Delta| = \Delta^{1/\alpha}\,\mathsf{E}|H_1|. \tag{14}$$

Hence, setting $\mathbb{H} = 1/\alpha < 1$ we obtain

$$\mathsf{E}|H_{t+\Delta} - H_t| = \Delta^{\mathbb{H}}\,\mathsf{E}|H_1|, \tag{15}$$

which resembles formula (12) for a fractional Brownian motion.
All these formulas, together with arguments based on the Law of large numbers, suggest that it would be reasonable to introduce certain variation-related characteristics and to use them for testing the statistical hypothesis that the process $H = (H_t)_{t\ge0}$ generating the prices $S = (S_t)_{t\ge0}$ is a self-similar process of the same kind as a fractional Brownian motion or an $\alpha$-stable Lévy motion.

It should be also pointed out that, from the standpoint of statistical analysis, different kinds of investors are interested in different time intervals and have different time horizons.

For instance, the short-term investors are eager to know the values of the prices $S = (S_t)_{t\ge0}$ at times $t_k = k\Delta$, $k \ge 0$, with small $\Delta > 0$ (several minutes or even seconds). Such data are of little interest to long-term investors; what they value most are data on the price movements over large time intervals (months and even years), information on cycles (periodic or aperiodic) and their duration, information on trend phenomena, and so on.

Bearing this in mind, we shall explicitly indicate in what follows the chosen time interval $\Delta$ (taken as a unit of time, the ‘characteristic’ time measure of an investor) and also the interval $(a, b]$ on which we study the evolution and the ‘changeability’ of the financial index in question.


6. One can get a satisfactory understanding of the changeability of a process $H = (H_t)_{t\ge0}$ on the time interval $(a, b]$ from the $\Delta$-variation

$$\mathrm{Var}_{(a,b]}(H;\Delta) = \sum |H_{t_k} - H_{t_{k-1}}|, \tag{16}$$

where the sum is taken over all $k$ such that $a \le t_{k-1} < t_k \le b$, and $t_k = k\Delta$.
Clearly, if a particular trajectory of $H = (H_t)_{a\le t\le b}$ is ‘sufficiently regular’ and $\Delta > 0$ is small, then the value of $\mathrm{Var}_{(a,b]}(H;\Delta)$ is close to the variation

$$\mathrm{Var}_{(a,b]}(H) = \int_a^b |dH_s|, \tag{17}$$

which is by definition the supremum

$$\sup \sum |H_{t_k} - H_{t_{k-1}}| \tag{18}$$

taken over all finite partitionings $(t_0, \ldots, t_n)$ of the interval $(a, b]$ such that $a = t_0 < t_1 < \cdots < t_n \le b$.

In the statistical analysis of the processes $H = (H_t)_{t\ge0}$ with presumably homogeneous increments, it is reasonable to consider, in place of the $\Delta$-variations $\mathrm{Var}_{(a,b]}(H;\Delta)$, the normalized quantities

$$v_{(a,b]}(H;\Delta) = \frac{\mathrm{Var}_{(a,b]}(H;\Delta)}{\left[\frac{b-a}{\Delta}\right]}, \tag{19}$$

which we shall call the (empirical) $\Delta$-volatilities on $(a, b]$.
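It is often useful to consider the following $\Delta$-volatility of order $\delta > 0$:

$$v^{(\delta)}_{(a,b]}(H;\Delta) = \frac{\mathrm{Var}^{(\delta)}_{(a,b]}(H;\Delta)}{\left[\frac{b-a}{\Delta}\right]}, \tag{20}$$

where

$$\mathrm{Var}^{(\delta)}_{(a,b]}(H;\Delta) = \sum |H_{t_k} - H_{t_{k-1}}|^\delta \tag{21}$$

and the summation proceeds as in (16).
We note that for a fractional Brownian motion $H = B_{\mathbb{H}}$ we have

$$\mathrm{Var}^{(2)}_{(a,b]}(H;\Delta) \xrightarrow{\ P\ } \begin{cases} \infty, & 0 < \mathbb{H} < \tfrac12, \\ b - a, & \mathbb{H} = \tfrac12, \\ 0, & \tfrac12 < \mathbb{H} \le 1, \end{cases} \tag{22}$$

as $\Delta \to 0$, where ‘$\xrightarrow{P}$’ is convergence in probability.
If $H$ is a strictly $\alpha$-stable Lévy motion with $0 < \alpha < 2$, then

$$\mathrm{Var}^{(2)}_{(a,b]}(H;\Delta) \xrightarrow{\ P\ } 0 \tag{23}$$

as $\Delta \to 0$.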
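For a path observed on the grid $t_k = k\Delta$, the quantities (16) and (19)-(21) reduce to sums over consecutive differences. The following sketch assumes the path is given as a list of values $H_{t_0}, H_{t_1}, \ldots$ with $t_0 = a$ and the last point at $b$, so that the number of steps equals $[(b-a)/\Delta]$; the function names and the sample path are hypothetical.

```python
def delta_variation(path, order=1.0):
    """Sum of |H_{t_k} - H_{t_{k-1}}|**order over the sampled path, as in (16) and (21)."""
    return sum(abs(y - x) ** order for x, y in zip(path, path[1:]))

def delta_volatility(path, order=1.0):
    """Delta-variation divided by the number of steps, as in (19) and (20)."""
    steps = len(path) - 1
    return delta_variation(path, order) / steps

# Hypothetical path sampled at five grid points (four steps).
path = [0.0, 0.5, 0.2, 0.9, 0.4]
var_1 = delta_variation(path)            # |0.5| + |0.3| + |0.7| + |0.5|
vol_1 = delta_volatility(path)           # var_1 / 4
var_2 = delta_variation(path, order=2)   # sum of squared increments
```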


Remark 2. One usually calls stochastic processes $H = (H_t)_{t\ge0}$ with property (23) zero-energy processes (see, e.g., [166]). Thus, it follows from (22) and (23) that both fractional Brownian motion with $1/2 < \mathbb{H} < 1$ and strictly $\alpha$-stable Lévy processes with $\mathbb{H} = 1/\alpha > 1/2$ are zero-energy processes.


7. The statistical analysis of volatility by means of R/S-analysis discussed below (see § 4) enables one to discover several remarkable and unexpected properties, which provide one with tools for the verification of some or other conjectures concerning the space-time structure of the processes $H = (H_t)_{t\ge0}$ (for models with continuous time) and $H = (H_n)_{n\ge0}$ (in the discrete-time case). For instance, one must definitely discard the conjecture of the independence of the variables $h_n$, $n \ge 1$ (generating the sequence $H = (H_n)_{n\ge0}$), for many financial indexes. (In the continuous-time case one must accordingly discard the conjecture that $H = (H_t)_{t\ge0}$ is a process with independent increments.)

Simultaneously, the analysis of $\Delta$-volatility and R/S-statistics supports the thesis that the variables $h_n$, $n \ge 1$, are in fact characterized by a rather strong aftereffect, and this allows one to cherish hopes of a ‘nontrivial’ prediction of the price development.

The fractal structure in volatilities can be exposed for many financial indexes (stock and bond prices, DJIA, the S&P500 Index, and so on). It is most clearly visible in currency exchange rates. We discuss this subject in the next section.
§ 3b. Periodicity and Fractal Structure of Volatility in Exchange Rates


1. In § 1b we presented the statistics (see Fig. 29 and 30) of the number of ticks occurring during a day or a week. They unambiguously indicate

the interday inhomogeneity

and

the presence of daily cycles (periodicity).

Representing the process $H = (H_t)_{t\ge0}$ with $H_t = \ln\dfrac{S_t}{S_0}$ by the formula

$$H_t = \sum_k h_{\tau_k} I(\tau_k \le t) \tag{1}$$

(cf. formula (7) in § 1c), we can say that Fig. 29 and Fig. 30 depict only the part of the development due to the ‘time-related’ components of $H$, the instants of ticks $\tau_k$, but give no insight in the structure of the ‘phase’ component, the sequence $(h_{\tau_k})$ or the sequence of $\tilde h_k = \tilde H_{t_k} - \tilde H_{t_{k-1}}$ (see the notation in § 2b).

The above-introduced concept of $\Delta$-volatility, based on the $\Delta$-variation $\mathrm{Var}_{(a,b]}(H;\Delta)$, enables one to gain a distinct notion of the ‘intensity’ of change in the processes $H$ and $\tilde H$ both with respect to the time and the phase variables.

To this end, we consider the $\Delta$-volatility $v_{(a,b]}(\tilde H;\Delta)$ on the interval $(a, b]$.

We note at the outset that if $a = (k-1)\Delta$ and $b = k\Delta$, then

$$v_{((k-1)\Delta,k\Delta]}(\tilde H;\Delta) = |\tilde H_{k\Delta} - \tilde H_{(k-1)\Delta}| = |\tilde h_k| \tag{2}$$

(see the notation in § 2b).
We choose as the object of our study the DEM/USD exchange rate, so that

$$S_t = (\mathrm{DEM/USD})_t \quad\text{and}\quad H_t = \ln\frac{S_t}{S_0}.$$

We set $\Delta$ to be equal to 1 hour and

$$t = 1, 2, \ldots, 24 \text{ (hours)}$$

in the case of the analysis of the ‘24-hour cycle’, and we set

$$t = 1, 2, \ldots, 168 \text{ (hours)}$$

in the case of the ‘week cycle’ (the clock is set going on Monday, 0:00 GMT, so that $t = 168$ corresponds to the end of the week).

The impressive database of Olsen & Associates enables one to obtain quite reliable estimates $\hat v_{((k-1)\Delta,k\Delta]}(\tilde H;\Delta) = \widehat{|\tilde h_k|}$ for the values of $v_{((k-1)\Delta,k\Delta]}(\tilde H;\Delta) = |\tilde h_k|$ for each day of the week.
To this end let time zero be 0:00 GMT of the first Monday covered by the database. If $\Delta = 1$ h, then setting $k = 1, 2, \ldots, 24$, we obtain the intervals $(0, 1], (1, 2], \ldots, (23, 24]$ corresponding to the intervals (of GMT)

$$(0{:}00, 1{:}00],\ (1{:}00, 2{:}00],\ \ldots,\ (23{:}00, 24{:}00].$$

As an estimate for $v_{((k-1)\Delta,k\Delta]}(\tilde H;\Delta)$ we take the arithmetic mean of the quantities $|\tilde H^{(j)}_{k\Delta} - \tilde H^{(j)}_{(k-1)\Delta}|$ calculated for all Mondays in the database (indexed by the integer $j$). In a similar way we obtain estimates of the $v_{((k-1)\Delta,k\Delta]}(\tilde H;\Delta)$ for Tuesdays ($k = 25, \ldots, 48$), ..., and for Sundays ($k = 145, \ldots, 168$).
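The averaging procedure just described can be sketched as follows, assuming the (interpolated) logarithmic prices are available on an hourly grid that starts at time zero (Monday, 0:00 GMT). The function name `weekly_profile`, the input layout, and the toy data are hypothetical; the real computation would run over the whole database.

```python
from collections import defaultdict

def weekly_profile(hourly_log_prices, hours_per_week=168):
    """Average |H_{k*Delta} - H_{(k-1)*Delta}| over all weeks for each hour k of the week.

    hourly_log_prices[m] is the log price m hours after time zero (Monday, 0:00 GMT).
    Returns a dict mapping k = 1..hours_per_week to the mean absolute increment.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for m in range(1, len(hourly_log_prices)):
        k = (m - 1) % hours_per_week + 1   # position of the interval within the week
        sums[k] += abs(hourly_log_prices[m] - hourly_log_prices[m - 1])
        counts[k] += 1
    return {k: sums[k] / counts[k] for k in sorted(sums)}

# Toy check with a 'week' of 3 hours and two weeks of data.
profile = weekly_profile([0, 1, 3, 6, 7, 9, 12], hours_per_week=3)
```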

The following charts (Fig. 36 and Fig. 37) from [427] are good illustrations of the interday inhomogeneity and of the daily cycles visible all over the week in the behavior of the $\Delta$-volatility $\hat v_{((k-1)\Delta,k\Delta]}(\tilde H;\Delta) = \widehat{|\tilde h_k|}$ calculated for the one-hour intervals $((k-1)\Delta, k\Delta]$, $k = 1, 2, \ldots$.

FIGURE 36. $\Delta$-volatility of the DEM/USD cross rate during one day ($\Delta = 1$ hour), according to Reuters (05.10.1992-26.09.1993)

The above-mentioned daily periodicity of the $\Delta$-volatility is also revealed by the analysis of its correlation properties. We devote the next section to this issue and to the discussion of the practical recommendations following from the statistical analysis of $\Delta$-volatility.


2. We discuss now the properties of the $\Delta$-volatility $v_t(\Delta) \equiv v_{(0,t]}(H;\Delta)$ regarded as a function of $\Delta$ for fixed $t$. We shall denote by $\hat v_t(\Delta)$ its estimator $\hat v_{(0,t]}(H;\Delta)$.

Assume that $t$ is sufficiently large, for instance, $t = T =$ one year. We shall now evaluate $\hat v_T(\Delta)$ for various $\Delta$. Not so long ago (see, e.g., [204], [362], [386], or [427]), the following remarkable property of the FX-market (and some other markets) was discovered: the behavior of the $\Delta$-volatility is highly regular; namely,

$$\hat v_T(\Delta) \approx C_T\Delta^{\mathbb{H}} \tag{3}$$

with certain constant $C_T$ dependent on the currencies in question and with $\mathbb{H} \approx 0.585$ for the basic currencies.

FIGURE 37. $\Delta$-volatility of the DEM/USD cross rate during one week ($\Delta = 1$ hour), according to Reuters (05.10.1992-26.09.1993). The intervals $(0,1], \ldots, (167,168]$ correspond to the intervals (0:00, 1:00], ..., (23:00, 24:00] of GMT

FIGURE 38. On the fractal structure of the $\Delta$-volatility $\hat v_T(\Delta)$. The values of $\ln \hat v_T(\Delta)$ as a function of $\ln\Delta$ are plotted along the vertical axis

To give (3) a more precise form we consider now the statistical data concerning $\ln \hat v_T(\Delta)$ as a function of $\ln\Delta$ with $\Delta$ ranging over a wide interval, from 10 minutes (= 600 seconds) to 2 months (= 2 × 30 × 24 × 60 × 60 = 5 184 000 seconds).
The chart in Fig. 38, which is constructed using the least squares method, shows that the empirical data cluster nicely along the straight line with slope $\mathbb{H} \approx 0.585$. Hence we can conclude from (3) that, for $t$ large, the volatility $\hat v_T(\Delta)$ regarded as a function of $\Delta$ has fractal structure with Hurst exponent $\mathbb{H} \approx 0.585$.

As shown in § 3a, we have $\mathsf{E}|H_\Delta| = \sqrt{2/\pi}\,\Delta^{1/2}$ for a Brownian motion $H = (H_t)_{t\ge0}$, $\mathsf{E}|H_\Delta| = \sqrt{2/\pi}\,\Delta^{\mathbb{H}}$ for a fractional Brownian motion $H = (H_t)_{t\ge0}$ with exponent $\mathbb{H}$, and $\mathsf{E}|H_\Delta| = \mathsf{E}|H_1|\,\Delta^{\mathbb{H}}$ with $\mathbb{H} = 1/\alpha < 1$ for a strictly $\alpha$-stable Lévy process with $\alpha > 1$.

Thus, the experimentally obtained value $\mathbb{H} = 0.585 > 1/2$ supports the conjecture that the process $H = (H_t)$, $t \ge 0$, can be satisfactorily described either by a fractional Brownian motion or by an $\alpha$-stable Lévy process with $\alpha = \dfrac{1}{\mathbb{H}} \approx \dfrac{1}{0.585} \approx 1.7$.
Remark. As regards the estimates for $\mathbb{H}$ in the case of a fractional Brownian motion, see Chapter III, § 2c.6.
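The slope estimate behind Fig. 38 can be imitated on simulated data. The sketch below computes the $\Delta$-volatility of a path for several step sizes and fits $\ln \hat v(\Delta)$ against $\ln\Delta$ by least squares. Everything here is an assumption made for illustration: the path is an ordinary Gaussian random walk (so the fitted exponent should come out near $1/2$, not near the value 0.585 reported for exchange rates), and the function names are hypothetical.

```python
import math
import random

def delta_volatility(path, step):
    """Mean absolute increment of the path sampled every `step` grid points."""
    sub = path[::step]
    return sum(abs(y - x) for x, y in zip(sub, sub[1:])) / (len(sub) - 1)

def loglog_slope(path, steps):
    """Least-squares slope of ln v(Delta) against ln Delta."""
    xs = [math.log(s) for s in steps]
    ys = [math.log(delta_volatility(path, s)) for s in steps]
    x_mean, y_mean = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    den = sum((x - x_mean) ** 2 for x in xs)
    return num / den

# Simulated Gaussian random walk (a discrete stand-in for Brownian motion).
rng = random.Random(1)
path = [0.0]
for _ in range(100_000):
    path.append(path[-1] + rng.gauss(0.0, 1.0))

h_hat = loglog_slope(path, [1, 2, 4, 8, 16, 32, 64])
alpha_from_h = 1.0 / h_hat   # the corresponding stable index, as in alpha = 1/H
```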


3. We now turn to Fig. 36. In it, the periods of maximum and minimum activity are clearly visible: 4:00 GMT (the minimum) corresponds to lunch time in Tokyo, Sydney, Singapore, and Hong Kong, when the life in the FX-market comes to a standstill. (This is nighttime in Europe and America.) We have already pointed out that the maximum activity (≈ 15:00) corresponds to time after lunch in Europe and the beginning of the business day in America.

The daily activity patterns during the five working days (Monday through Friday) are rather similar. Activity fades significantly on weekends. On Saturday and most of Sunday it is almost nonexistent. On Sunday evening, when the East Asian market begins its business day, activity starts to grow.


§3c. Correlation Properties 


1. Again, we consider the DEM/USD exchange rate, which (as already pointed out in § 1a.4) is characterized by a high intensity of ticks (on the average, 3-4 ticks per minute on usual days and 15-20 ticks per minute on days of higher activity, as in July, 1994).

The above-described phenomena of periodicity in the occurrences of ticks and in 
A-volatility are visible also in the correlation analysis of the absolute values of the 
changes |AH|. We present the corresponding results below, in subsection 3, while 
we start with the correlation analysis of the values of AH themselves. 


2. Let $S_t = (\mathrm{DEM/USD})_t$ and let $H_t = \ln\dfrac{S_t}{S_0}$. We denote the results of an appropriate linear interpolation (see § 2b) by $\tilde S_t$ and $\tilde H_t = \ln\dfrac{\tilde S_t}{\tilde S_0}$, respectively.

We choose a time interval $\Delta$; let

$$\tilde h_k = \tilde H_{t_k} - \tilde H_{t_{k-1}}$$

with $t_k = k\Delta$. (In § 2b we also used the notation $\tilde h_k^{(\Delta)}$; in accordance with § 3b, $|\tilde h_k| = v_{(t_{k-1},t_k]}(\tilde H;\Delta)$.)

Let $\Delta$ be 1 minute and let $k = 1, 2, \ldots, 60$. Then $\tilde h_1, \tilde h_2, \ldots, \tilde h_{60}$ is the sequence of consecutive (one-minute) increments of $\tilde H$ over the period of one hour. We can assume that the sequence $\tilde h_1, \tilde h_2, \ldots, \tilde h_{60}$ is stationary (homogeneous) on this interval.

Traditionally, as a measure of the correlation dependence of stationary sequences $h = (h_1, h_2, \ldots)$, one takes their correlation function

$$\rho(k) = \frac{\mathsf{E}h_nh_{n+k} - \mathsf{E}h_n \cdot \mathsf{E}h_{n+k}}{\sqrt{\mathsf{D}h_n \cdot \mathsf{D}h_{n+k}}} \tag{1}$$

(the autocorrelation function in the theory of stochastic processes).
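An empirical counterpart of (1) replaces the expectations by sample averages. The sketch below is one common variant, assuming stationarity so that a single sample mean and a single sample variance are used for both factors; the function name `autocorr` and the normalization by the full sample size are choices of this illustration, not necessarily those of [204].

```python
def autocorr(h, k):
    """Empirical autocorrelation at lag k of a sequence assumed to be stationary."""
    n = len(h)
    mean = sum(h) / n
    var = sum((x - mean) ** 2 for x in h) / n
    cov = sum((h[i] - mean) * (h[i + k] - mean) for i in range(n - k)) / n
    return cov / var

# A strictly alternating sequence is (almost) perfectly negatively correlated at lag 1.
alternating = [1.0 if i % 2 == 0 else -1.0 for i in range(100)]
rho_0 = autocorr(alternating, 0)
rho_1 = autocorr(alternating, 1)
```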

The corresponding statistical analysis has been carried out by Olsen & Associates for the data in their (rather representative) database covering the period from January 05, 1987 through January 05, 1993 (see [204]). Using its results one can plot the following graph of the (empirical) autocorrelation function $\hat\rho(k)$:

FIGURE 39. Empirical autocorrelation function $\hat\rho(k)$ for the sequence of increments $\tilde h_n = \tilde H_{t_n} - \tilde H_{t_{n-1}}$ corresponding to the DEM/USD cross rate, with $t_n = n\Delta$ and $\Delta = 1$ minute

In Fig. 39 one clearly sees a negative correlation on the interval of approximately 4 minutes ($\hat\rho(1) < 0$, $\hat\rho(2) < 0$, $\hat\rho(3) < 0$, $\hat\rho(4) < 0$), while most of the values of the $\hat\rho(k)$ with $4 < k < 60$ are close to zero.

Bearing this observation in mind, we can assume that the variables $\tilde h_n$ and $\tilde h_m$ are virtually uncorrelated for $|n - m| > 4$.

We note that the phenomenon of negative correlation on small intervals ($|n - m| \le 4$ minutes) was mentioned for the first time in [189] and [191]; it has been noticed for many financial indexes (see, for instance, [145] and [192]).
There exist various explanations in the literature of this negative correlation of the increments $\Delta\tilde H$ on small intervals of time. For instance, the discussion in [204] essentially reduces to the observation that the traders in the FX-market are not uniform, their interests can ‘point to different directions’, and they can interpret available information in different ways. Traders often widen or narrow the spread when they are given instructions to ‘get the market out of balance’. Moreover, many banks overstate their spreads systematically (see [192] in this connection).

A possible ‘mathematical’ explanation of the phenomenon of negative values of $\mathrm{Cov}(\tilde h_n, \tilde h_{n+k}) = \mathsf{E}\tilde h_n\tilde h_{n+k} - \mathsf{E}\tilde h_n\,\mathsf{E}\tilde h_{n+k}$ for small $k$ can be, e.g., as follows (cf. [481]).

Let $H_n = h_1 + \cdots + h_n$ with $h_n = \mu_n + \sigma_n\varepsilon_n$, where the $\sigma_n$ are $\mathcal{F}_{n-1}$-measurable and $(\varepsilon_n)$ is a sequence of independent identically distributed random variables. We can also assume that the $\mu_n$ are $\mathcal{F}_{n-1}$-measurable variables. Judging by a large amount of statistical data, the ‘mean values’ $\mu_n$ are much smaller than $\sigma_n$ (see, e.g., the table in § 2b.2) and can be set to be equal to zero for all practical purposes.

The values of $H_n = \ln\dfrac{S_n}{S_0}$ are in practice not always known precisely; it would be more realistic to assume that what we know are the values of $\tilde H_n = H_n + \delta_n$, where $(\delta_n)$ is white noise, the noise component related to inaccuracies in our knowledge about the actual values of prices, rather than to these values themselves.

We assume that $(\delta_n)$ is a sequence of independent random variables with $\mathsf{E}\delta_n = 0$ and $\mathsf{E}\delta_n^2 = C > 0$. Then, considering the sequence $\tilde h = (\tilde h_n)$ of the variables $\tilde h_n = \Delta\tilde H_n = \sigma_n\varepsilon_n + (\delta_n - \delta_{n-1})$ we obtain

$$\mathsf{E}\tilde h_n = 0, \qquad \mathsf{E}\tilde h_n^2 = \mathsf{E}\sigma_n^2 + 2C$$

and

$$\mathsf{E}\tilde h_n\tilde h_{n+1} = -C, \qquad \mathsf{E}\tilde h_n\tilde h_{n+k} = 0, \quad k > 1.$$

Hence the covariance function

$$\mathrm{Cov}(\tilde h_n, \tilde h_{n+k}) = \mathsf{E}\tilde h_n\tilde h_{n+k} - \mathsf{E}\tilde h_n \cdot \mathsf{E}\tilde h_{n+k}$$

(provided that $\mathsf{E}\sigma_n^2 = \mathsf{E}\sigma_1^2$, $n \ge 1$) can be described by the formula

$$\mathrm{Cov}(\tilde h_n, \tilde h_{n+k}) = \begin{cases} \mathsf{E}\sigma_1^2 + 2C, & k = 0, \\ -C, & k = 1, \\ 0, & k > 1. \end{cases}$$
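The covariance structure just derived is easy to check by simulation. The sketch below takes the simplest special case, a constant volatility $\sigma_n \equiv \sigma$ and Gaussian $\varepsilon_n$ and $\delta_n$ independent of each other; these distributional choices, the function names, and the parameter values are assumptions of the illustration. With $\sigma = 1$ and $C = 0.5$ the sample covariances at lags 0, 1, 2 should be close to 2, $-0.5$, and 0.

```python
import random

def simulate_observed_increments(n, sigma=1.0, c=0.5, seed=0):
    """Increments sigma*eps_n + (delta_n - delta_{n-1}) of the noisy observations H_n + delta_n."""
    rng = random.Random(seed)
    noise_sd = c ** 0.5                     # E delta_n^2 = C
    delta_prev = rng.gauss(0.0, noise_sd)
    out = []
    for _ in range(n):
        delta = rng.gauss(0.0, noise_sd)
        out.append(sigma * rng.gauss(0.0, 1.0) + (delta - delta_prev))
        delta_prev = delta
    return out

def sample_cov(x, k):
    """Sample covariance of the sequence x at lag k."""
    n = len(x)
    mean = sum(x) / n
    return sum((x[i] - mean) * (x[i + k] - mean) for i in range(n - k)) / (n - k)

h_obs = simulate_observed_increments(200_000)
cov0, cov1, cov2 = (sample_cov(h_obs, k) for k in (0, 1, 2))
```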
3. To reveal cycles in volatilities by means of correlation analysis we proceed as follows.
We fix the interval $\Delta$ equal to 20 minutes. Let $t_0 = 0$ correspond to Monday, 0:00 GMT, let $t_1 = \Delta = 20$ m, $t_2 = 2\Delta = 40$ m, $t_3 = 3\Delta = 1$ h, ..., $t_{504} = 504\Delta = 1$ week, ..., $t_{2016} = 2016\Delta = 4$ weeks (= 1 month).

FIGURE 40. Empirical autocorrelation function $\hat R(k)$ for the sequence $|\tilde h_n| = |\tilde H_{t_n} - \tilde H_{t_{n-1}}|$ corresponding to the DEM/USD cross rate (the data of Reuters; October 10, 1992-September 26, 1993; [90], [204]). The value $k = 504$ corresponds to 1 week and $k = 2016$ to 4 weeks

We set $\tilde h_n = \tilde H_{t_n} - \tilde H_{t_{n-1}}$; let

$$R(k) = \frac{\mathsf{E}|\tilde h_n||\tilde h_{n+k}| - \mathsf{E}|\tilde h_n| \cdot \mathsf{E}|\tilde h_{n+k}|}{\sqrt{\mathsf{D}|\tilde h_n| \cdot \mathsf{D}|\tilde h_{n+k}|}}$$

be the autocorrelation function of the sequence $|\tilde h| = (|\tilde h_1|, |\tilde h_2|, \ldots)$.

The graph of the corresponding empirical autocorrelation function $\hat R(k)$ for $k = 0, 1, \ldots, 2016$ (that is, over the period of four weeks) is plotted in Fig. 40. One clearly sees in it a periodic component in the autocorrelation function of the $\Delta$-volatility of the sequence $|\tilde h| = (|\tilde h_n|)_{n\ge1}$ with $|\tilde h_n| = |\tilde H_{t_n} - \tilde H_{t_{n-1}}|$, $\Delta = t_n - t_{n-1}$.
As is known, to demonstrate the full strength of the correlation methods one requires that the sequence in question be stationary. We see, however, that $\Delta$-volatility does not have this property. Hence there arises the natural desire to ‘flatten’ it in one or another way, making it a stationary, homogeneous sequence.

This procedure of ‘flattening’ the volatility is called ‘devolatization’. We discuss it in the next section, where we pay most attention to the concept of ‘change of time’, well-known in the theory of random processes, and the idea of operational ‘$\theta$-time’, which Olsen & Associates use methodically (see [90], [204], and [362]) in their analysis of the data relating to the FX-market.


§ 3d. ‘Devolatization’. Operational Time 


1. We start from the following example, which is a good illustration of the main stages of ‘devolatization’, a procedure of ‘flattening’ the volatilities.
Let $H_t = \int_0^t \sigma(u)\,dB_u$, where $B = (B_t)_{t\ge0}$ is a standard Brownian motion and $\sigma = (\sigma(t))_{t\ge0}$ is some deterministic function (the ‘activity’) characterizing the ‘contribution’ of the $dB_u$, $u \le t$, to the value of $H_t$. We note that

$$h_n \equiv H_n - H_{n-1} = \int_{n-1}^n \sigma(u)\,dB_u \stackrel{d}{=} \sigma_n\varepsilon_n \tag{1}$$

for each $n \ge 1$, where $\varepsilon_n \sim \mathcal{N}(0,1)$, $\sigma_n^2 = \int_{n-1}^n \sigma^2(u)\,du$, and the symbol ‘$\stackrel{d}{=}$’ means that variables coincide in distribution.

Thus, if we can register the values of the process $H = (H_t)_{t\ge0}$ only at discrete instants $n = 1, 2, \ldots$, then the observed sequence $h_n \equiv H_n - H_{n-1}$ has a perfectly simple structure of a Gaussian sequence $(\sigma_n\varepsilon_n)_{n\ge1}$ of independent random variables with zero means and, in general, inhomogeneous variances (volatilities) $\sigma_n^2$.

In the discussion that follows we present a method for transforming the data so that the inhomogeneous variables $\sigma_n^2$, $n \ge 1$, become ‘flattened’.

We set

$$\tau(t) = \int_0^t \sigma^2(u)\,du \tag{2}$$

and

$$\tau^*(\theta) = \inf\Bigl\{t\colon \int_0^t \sigma^2(u)\,du = \theta\Bigr\} \quad (= \inf\{t\colon \tau(t) = \theta\}), \tag{3}$$

where $\theta \ge 0$.
We shall assume that $\sigma(t) > 0$ for each $t > 0$, $\int_0^t \sigma^2(u)\,du < \infty$ (this property ensures that the stochastic integral $\int_0^t \sigma(u)\,dB_u$ with respect to the Brownian motion $B = (B_u)_{u\ge0}$ is well defined; see Chapter III, § 3c), and let $\int_0^t \sigma^2(u)\,du \uparrow \infty$ as $t \to \infty$.
Alongside physical time $t \ge 0$, we shall also consider new, operational ‘time’ $\theta$ defined by the formula

$$\theta = \tau(t). \tag{4}$$
The ‘return’ from this operational time $\theta$ to physical time is described by the inverse transformation

$$t = \tau^*(\theta). \tag{5}$$

We note that

$$\int_0^{\tau^*(\theta)} \sigma^2(u)\,du = \theta \tag{6}$$

by (3), i.e., $\tau(\tau^*(\theta)) = \theta$, so that $\tau^*(\theta) = \tau^{-1}(\theta)$ and $\tau^*(\tau(t)) = t$.

We now consider the function $\theta = \tau(t)$ performing this transformation of physical time into operational time.

Since

$$\theta_2 - \theta_1 = \int_{t_1}^{t_2} \sigma^2(u)\,du, \tag{7}$$

we see that if the activity $\sigma^2(u)$ is small, then this transformation ‘compresses’ the physical time (as in Fig. 41).


FIGURE 41. ‘Compression’ of a (large) time interval $[t_1, t_2]$ characterized by small activity into a (short) interval $[\theta_1, \theta_2]$ of $\theta$-time


On the other hand, if the activity $\sigma^2(u)$ is large, then the process goes in the
opposite direction: short intervals $(t_1, t_2)$ of physical time (see Fig. 42) correspond
to larger intervals $(\theta_1, \theta_2)$ of operational time; time is being ‘stretched’.

Now, we construct another process,

$$H^*_\theta = H_{\tau^*(\theta)}, \tag{8}$$

which proceeds in operational time. Clearly, we can ‘return’ from $H^*$ to the old
process by the formula

$$H_t = H^*_{\tau(t)}, \tag{9}$$

because $\tau^*(\tau(t)) = t$.

FIGURE 42. ‘Stretching’ a (short) interval $[t_1, t_2]$ characterized
by large activity into a (long) interval $[\theta_1, \theta_2]$ of $\theta$-time


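Note that if $\theta_1 < \theta_2$, then

$$H^*_{\theta_2} - H^*_{\theta_1} = H_{\tau^*(\theta_2)} - H_{\tau^*(\theta_1)} = \int_{\tau^*(\theta_1)}^{\tau^*(\theta_2)} \sigma(u)\,dB_u = \int_0^\infty I\bigl(\tau^*(\theta_1) < u \le \tau^*(\theta_2)\bigr)\,\sigma(u)\,dB_u.$$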


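Hence $H^*$ is a process with independent increments, $H^*_0 = 0$, and $\mathsf{E} H^*_\theta = 0$. Moreover,
by the properties of stochastic integrals (see Chapter III, § 3c) we obtain

$$\mathsf{E}\bigl|H^*_{\theta_2} - H^*_{\theta_1}\bigr|^2 = \int_0^\infty I\bigl(\tau^*(\theta_1) < u \le \tau^*(\theta_2)\bigr)\,\sigma^2(u)\,du = \int_{\tau^*(\theta_1)}^{\tau^*(\theta_2)} \sigma^2(u)\,du = \theta_2 - \theta_1 \tag{10}$$

(the last equality is a consequence of (6)).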
Since $H^*$ is also a Gaussian process and has independent increments, zero mean,
property (10), and continuous trajectories, it is just a standard Brownian motion
(see Chapter III, § 3a), so that

$$H^*_\theta = \int_0^\theta \sigma^*(u)\,dH^*_u, \tag{11}$$

where $\sigma^*(u) \equiv 1$.
A comparison with the representation $H_t = \int_0^t \sigma(u)\,dB_u$, where $\sigma(u) \not\equiv 1$ in
general, shows that the transition to operational time has ‘flattened’ the characteristic
of activity $\sigma = \sigma(u)$: when measured with respect to new ‘time’ $\theta$, the level of
activity is flat ($\sigma^*(u) \equiv 1$).

We have assumed above that $\sigma(u)$ is deterministic. However, $H^*_\theta = H_{\tau^*(\theta)}$ will
be a Wiener process even if the change of time is random, defined by formula (3)
with $\sigma(u) = \sigma(u; \omega)$, provided that $\int_0^t \sigma^2(u; \omega)\,du < \infty$ with probability one and
$\int_0^t \sigma^2(u; \omega)\,du \uparrow \infty$ as $t \to \infty$. However, there exists a fundamental distinction
between the cases of a deterministic function $\sigma = \sigma(u)$ and a random function
$\sigma = \sigma(u; \omega)$: in the first case we can calculate the change $t \rightsquigarrow \theta = \tau(t)$ in advance,
also for the ‘future’, which is impossible in the second case, because the ‘random’
changes of time corresponding to distinct realizations $\sigma = \sigma(u; \omega)$, $u \ge 0$, are
distinct.
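
The ‘flattening’ effect expressed by (10) can be illustrated by simulation in the deterministic case. The following is a rough sketch under stated assumptions: the activity profile, the Euler-type discretisation of the stochastic integral, and all numerical parameters are hypothetical choices made for this example only.

```python
import numpy as np

rng = np.random.default_rng(0)
n_paths, n_steps, T = 4000, 1000, 4.0
dt = T / n_steps
t_left = np.arange(n_steps) * dt
sigma = 1.0 + 0.8 * np.sin(2.0 * np.pi * t_left)     # hypothetical activity profile

# Discretised H_t = int_0^t sigma(u) dB_u on the grid 0, dt, ..., T.
dB = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps))
H = np.hstack([np.zeros((n_paths, 1)), np.cumsum(sigma * dB, axis=1)])

# tau(t) on the same grid (Riemann sum of sigma^2).
tau = np.concatenate(([0.0], np.cumsum(sigma ** 2 * dt)))

n_bins = 8
theta = np.linspace(0.0, tau[-1], n_bins + 1)        # equal steps of operational time
idx = np.minimum(np.searchsorted(tau, theta), n_steps)   # grid index of tau*(theta)
dtheta = tau[-1] / n_bins

# Sample variances of increments in operational and in physical time.
var_operational = np.var(np.diff(H[:, idx], axis=1), axis=0)
var_physical = np.var(np.diff(H[:, :: n_steps // n_bins], axis=1), axis=0)
```

In this experiment the entries of `var_operational` should all be close to `dtheta`, while the entries of `var_physical` alternate between large and small values, following the activity.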


2. We consider now a sequence $h = (h_n)_{n \ge 1}$, where $h_n = \sigma_n \varepsilon_n$ with nonflat level
of activity $\sigma_n$, $n \ge 1$. We shall treat $n$ as physical (‘old’) time.
We introduce the sequence of times

$$\tau^*(\theta) = \min\Bigl\{ m \ge 1 : \sum_{k=1}^{m} \sigma_k^2 \ge \theta \Bigr\},$$

with positive integer $\theta$ that we treat as operational (‘new’) time.

Also, let

$$h^*_\theta = \sum_{\tau^*(\theta-1) < k \le \tau^*(\theta)} h_k$$

for $\theta = 1, 2, \dots$, where $\tau^*(0) = 0$.
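We note that $\mathsf{E} h^*_\theta = 0$ and the corresponding variances are

$$\mathsf{D} h^*_\theta = \mathsf{D}\Bigl[\sum_{\tau^*(\theta-1) < k \le \tau^*(\theta)} h_k\Bigr] = \sum_{\tau^*(\theta-1) < k \le \tau^*(\theta)} \sigma_k^2 \approx 1,$$

because the values of the $\sigma_k^2$ are usually fairly small (see the table in § 2b.2).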

Thus, we can say that the transition to new ‘time’ $\theta$ transforms the inhomogeneous
sequence $h = (h_n)_{n \ge 1}$ into an (almost) homogeneous sequence $h^* = (h^*_\theta)_{\theta \ge 1}$.
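
The discrete change of time just described can be sketched as follows. This is a minimal illustration, not a reference implementation: the function name `devolatize` and the toy input arrays are hypothetical, and the variances `sigma2` are assumed to be known.

```python
import numpy as np

def devolatize(h, sigma2):
    """Aggregate h_k into h*_theta with tau*(theta) = min{m >= 1: sum_{k<=m} sigma2_k >= theta}."""
    h = np.asarray(h, dtype=float)
    cum = np.cumsum(np.asarray(sigma2, dtype=float))
    n_theta = int(np.floor(cum[-1]))                 # largest integer theta that is reached
    thetas = np.arange(1, n_theta + 1)
    # searchsorted gives the 0-based index of the first cumulative sum >= theta;
    # adding 1 turns it into the 1-based time tau*(theta).
    tau_star = np.searchsorted(cum, thetas, side="left") + 1
    H = np.concatenate(([0.0], np.cumsum(h)))        # H_m = h_1 + ... + h_m, H_0 = 0
    bounds = np.concatenate(([0], tau_star))         # tau*(0) = 0
    h_star = H[bounds[1:]] - H[bounds[:-1]]          # sums over tau*(theta-1) < k <= tau*(theta)
    return tau_star, h_star

# Toy data: cumulative variances are 0.5, 1, 1.25, 1.5, 1.75, 2, 3.
sigma2 = [0.5, 0.5, 0.25, 0.25, 0.25, 0.25, 1.0]
h = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
tau_star, h_star = devolatize(h, sigma2)
# tau_star -> [2, 6, 7], h_star -> [3., 18., 7.]
```

Each block of physical time ends as soon as the accumulated variance reaches the next integer, so blocks are long where the $\sigma_k^2$ are small and short where they are large.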

If the $\sigma_n$ are random variables, that is $\sigma_n = \sigma_n(\omega)$, and our aim is to calculate
the change of time for all instants (including the future), then we can implement the
above-discussed idea of ‘devolatization’ by replacing the $\sigma_n^2(\omega)$ by their mean values
$\mathsf{E}\sigma_n^2(\omega)$ or, in practice, by estimates of these mean values.

We see from the representation $h_n = \sigma_n \varepsilon_n$ that if the $\sigma_n$ are $\mathcal{F}_{n-1}$-measurable,
then $\mathsf{E} h_n^2 = \mathsf{E} \sigma_n^2$. Hence if time $n$ of GMT falls, for example, on Monday, then we
can take an estimate for $\mathsf{E}\sigma_n^2$ equal to the arithmetic mean of the values of $h_k^2$ over
all $k$ such that $((k-1)\Delta, k\Delta]$ corresponds to the same time interval of some other
Monday covered by the database.
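
The averaging over ‘the same time interval of other weeks’ can be sketched as below. The function name and the toy data are hypothetical; the only assumption about the data is that observations are equally spaced and that the seasonal pattern repeats every `period` observations.

```python
import numpy as np

def seasonal_variance(h, period):
    """Estimate E sigma_n^2 for each slot of the period by averaging h_k^2 over same-slot k."""
    h = np.asarray(h, dtype=float)
    n_full = (len(h) // period) * period          # drop an incomplete last period
    squares = h[:n_full].reshape(-1, period) ** 2 # one row per period, one column per slot
    return squares.mean(axis=0)                   # one estimate per slot

# Toy example with a hypothetical period of 3 slots and 4 repetitions.
h = np.tile([1.0, -2.0, 3.0], 4)
sigma2_hat = seasonal_variance(h, period=3)       # -> array([1., 4., 9.])
```

With $\Delta$ equal to 20 minutes a week contains 504 intervals, so for weekly periodicity one would call `seasonal_variance(h, period=504)`.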

The change of time (2) is defined in terms of $\sigma(u)$ squared, which, of course, is
not the only possible way to change time $t \rightsquigarrow \theta = \tau(t)$. For example, we could use
the values of $|\sigma(u)|$ in place of $\sigma^2(u)$.

3. This is just the change of time used by Olsen & Associates ([90], [360]-[362]).
They say that this kind of ‘devolatization’ captures periodicity better and the behavior
pattern of the autocorrelation function of the ‘devolatized’ sequence $(|h^*_\theta|)$
for the DEM/USD exchange rate is more ‘smooth’.

Referring to [90] for the details, we shall present only the result of their statistical
analysis of the properties of the autocorrelation function of the sequence $|h^*|$.

As in § 3c.3, we assume that $\Delta$ is equal to 20 minutes, $t_n = n\Delta$, and
$h_n = H_{t_n} - H_{t_{n-1}}$.

In Fig. 42 (§3c) we plotted the graph of the empirical estimate $\widehat{R}(k)$ of the
autocorrelation function

$$R(k) = \frac{\mathsf{E}|h_n||h_{n+k}| - \mathsf{E}|h_n|\,\mathsf{E}|h_{n+k}|}{\sqrt{\mathsf{D}|h_n| \cdot \mathsf{D}|h_{n+k}|}} \tag{12}$$

in which the periodic structure is clearly visible.
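
A sample version of (12) can be computed as in the following sketch, in which expectations and variances are replaced by empirical means and variances over the overlapping parts of the series. The function name and the toy periodic input are hypothetical.

```python
import numpy as np

def abs_autocorrelation(h, max_lag):
    """Empirical autocorrelation of |h_n| at lags 0..max_lag, following formula (12)."""
    a = np.abs(np.asarray(h, dtype=float))
    out = np.empty(max_lag + 1)
    for k in range(max_lag + 1):
        x, y = a[: len(a) - k], a[k:]             # pairs (|h_n|, |h_{n+k}|)
        cov = np.mean(x * y) - x.mean() * y.mean()
        out[k] = cov / np.sqrt(x.var() * y.var())
    return out

# Toy series whose absolute values repeat with period 4.
h = np.tile([1.0, -2.0, 3.0, -1.0], 50)
R = abs_autocorrelation(h, max_lag=8)
```

For this toy input the estimate equals one at the lags 0, 4, and 8, which is the kind of periodic structure the text refers to.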

In [90], after ‘devolatization’ and the transition to new, operational ‘time’ $\theta$
the authors plot a graph (see Fig. 43) of the empirical correlation function $R^*(\theta)$,
$\theta \ge 0$, for the sequence $|h^*| = (|h^*_\theta|)_{\theta \ge 1}$. This graph is rather interesting for
further analysis. In the same paper one can find a graph (see Fig. 44) of the
transition $\tau^*(\theta)\colon \theta \rightsquigarrow t$ from operational time $\theta$ to real time $t$ (just in our case of
the DEM/USD cross rate; new time is normalized so that a week of physical time
corresponds to a week of operational time).

It is clear from Fig. 44 that the function $t = \tau^*(\theta)$ is approximately linear
during the five business days, while on the week-end, when the transactions in the
FX-market fade, large intervals of physical time correspond to small intervals of
operational time. (Only the latter are in fact interesting for business trading.)


4. It should be noted that the above method of ‘devolatizing’ in order to eliminate
the periodic component is not the only one used in the analysis of the FX-market.
For instance, we can point out the papers [7], [13], and [306], where most diversified
techniques (linear and nonlinear regression analysis, Fourier transformation, neural
networks) are used to find similar patterns in financial time series. We also point out
the paper [297] by I. L. Legostaeva and this author, which relates to the same range
of problems. In it (in connection with the study of the Wolf numbers characterizing
solar activity), to analyze the ‘trend’ component $a = (a_k)$ of a sequence $\xi = (\xi_k)$
with an additive ‘white noise’ $\eta = (\eta_k)$ ($\xi_k = a_k + \eta_k$), the authors use a minimax
approach suitable for the study of a much broader class of ‘trend’ sequences
$a = (a_k)_{k \ge 1}$ than the standard polynomial classes of regression analysis. (See also
[45], [338], and [416].)


5. In conclusion we present a graph of the periodic component of the ‘activity’
(see § 3b) corresponding to the CHF/USD cross rate, which is isolated by means of
‘devolatization’ (see [90]; cf. also Fig. 37 in § 3b).

FIGURE 43. Empirical autocorrelation function $R^*(\theta)$ for the sequence
$|h^*| = (|h^*_\theta|)_{\theta \ge 1}$ of the absolute values of ‘devolatized’ variables
considered with respect to operational ‘time’ $\theta$, with interval
$\Delta\theta = 20$ m; the case of the DEM/USD cross rate ([90])

FIGURE 44. Graph of the transition $t = \tau^*(\theta)$ from operational time $\theta$ (plotted
along the horizontal axis) to physical time (along the vertical axis). Time is
measured in hours; 168 hours form 1 week ([90])

FIGURE 45. The thick curve is the periodic component of the
‘activity’ corresponding to the CHF/USD cross rate (over the
period of 168 hours = 1 week)


In Fig. 45 we clearly see the geography-related structure of the periodic compo- 
nent (with 24-hour cycle during the week) arising from the differences in business 
time between three major FX-markets: East Asia, Europe, and America. In [D4] 
one can find an interesting representation of this component as the sum of three 
periodic components corresponding to these three markets. This can be helpful if 
one wants to take the factor of periodicity more consistently into account in the 
predictions of the evolution of exchange rates. 


§3e. ‘Cluster’ Phenomenon and Aftereffect in Prices 


1. We assumed in our initial scheme that the exchange rates (prices) $S = (S_t)_{t \ge 0}$
and their logarithms $H_t = \ln\dfrac{S_t}{S_0}$ could be described by stochastic processes with
discrete intervention of chance:

$$S_t = S_0 + \sum_{n \ge 1} s_{\tau_n} I(\tau_n \le t) \tag{1}$$

and

$$H_t = \sum_{n \ge 1} h_{\tau_n} I(\tau_n \le t). \tag{2}$$

After that, we proceeded to their continuous modifications $S = (S_t)_{t \ge 0}$,
$H = (H_t)_{t \ge 0}$ and, finally, to the variables $h_n = H_{t_n} - H_{t_{n-1}}$, where $t_n - t_{n-1} \equiv \Delta$
(= const). It was for these variables $h_n$ and for $\Delta$ equal to 1 minute that we discovered
the negative autocovariance $\rho(k) = \mathsf{E} h_n h_{n+k} - \mathsf{E} h_n\,\mathsf{E} h_{n+k}$ for small values
of $k$ ($k = 1, 2, 3, 4$). For $k$ large the autocovariance is close to zero, therefore we can
assume for all practical purposes that the variables $h_n$ and $h_{n+k}$ are uncorrelated
for such $k$.

Of course we are far from saying that they are independent; our analysis of the
empirical autocorrelation function $R(k)$ of the absolute values $|h_n|$ in §3c (for the
DEM/USD cross rate, as usual) showed that this was not the case.

The next step (‘devolatization’) enables us to flatten the level of ‘activity’ by
means of the transition to new, operational time that takes into account the different
intensity of changes in the values of the process $H = (H_t)_{t \ge 0}$ on different intervals.

As is clear from the statistical analysis of the sequence $(|h_n|)_{n \ge 1}$ considered with
respect to operational time $\theta$, the autocorrelation function $R^*(\theta)$


1) is rather large for small $\theta$;
2) decreases fairly slowly with the growth of $\theta$.


It is maintained in [90] that for periods of about one month the behavior of
$R^*(\theta)$ can be fairly well described by a power function, rather than an exponential;
that is, we have

$$R^*(\theta) \sim k\theta^{-\alpha} \quad \text{as } \theta \to \infty, \tag{3}$$

rather than

$$R^*(\theta) \sim k\exp(-\theta^{\beta}) \quad \text{as } \theta \to \infty, \tag{4}$$

as could be expected and which holds in many popular models of financial mathematics
(for instance, ARCH and GARCH; see [193] and [202] for greater detail).

This property of the slow decrease of the empirical autocorrelation function
$R^*(\theta)$ has important practical consequences. It means that we have indeed a strong
aftereffect in prices; ‘prices remember their past’, so to say. Hence one may hope
to be able to predict price movements. To this end, one must, of course, learn to
produce sequences $h = (h_n)_{n \ge 1}$ that have at least correlation properties similar
to the ones observed in practice. (See [89], [360], and Chapter II, §3b in this
connection.)


2. The fact that the autocorrelation is fairly strong for small $\theta$ is a convincing
explanation for the cluster property of the ‘activity’ measured in terms of the volatility
of $|h_n|$.

This property, known since B. Mandelbrot’s paper [322] (1963), essentially
means the kind of behavior when large values of volatility are usually followed
also by large values and small values are followed by small ones.

That is, if the variation $|h_n| = |H_{t_n} - H_{t_{n-1}}|$ is large, then (with probability
sufficiently close to one) the next value, $|h_{n+1}|$, will also be large, while if $|h_n|$ is
small, then (with probability close to one) the next value will also be small. This
property is clearly visible in Fig. 46; it can be observed in practice for many financial
indexes.
FIGURE 46. Cluster property of the variables $h_k$ for the
DEM/USD cross rate (the data of Reuters; October 5, 1992-November
2, 1992; [427]). The interval $\Delta$ is 20 minutes, the mark 504 corresponds
to one week, and 2016 corresponds to four weeks. The clusters of large
and small values of $|h_k|$ are clearly visible.


We note that this cluster property is also well captured by R/S-analysis, which 


we discuss in the next section. 


4. Statistical R/S-Analysis


§4a. Sources and Methods of R/S-Analysis 


1. In Chapter III, § 2a we described the phenomena of long memory and self- 
similarity discovered by H. Hurst [236] (1951) in the statistical data concerning 
Nile’s annual run-offs. This discovery brought him to the creation of the so-called 
R/S-analysis. 

This method is not sufficiently well-known to practical statisticians. However,
Hurst’s method deserves greater attention, for it is robust and enables one to detect
such phenomena in statistical data as the cluster property, persistence (in following
a trend), strong aftereffect, long memory, fast alternation of successive values
(antipersistence), the fractal property, the existence of periodic or aperiodic cycles.
It can also distinguish between ‘stochastic’ and ‘chaotic’ noises and so on.

Besides Hurst’s keystone paper [236], a fundamental role in the development
of R/S-analysis, its methods and applications belongs to B. Mandelbrot and his
colleagues ([314], [316]-[319], [321]-[325], [327]-[329]), and also to the works of
E. Peters (which include two monographs [385] and [386]), containing large stocks of
(mostly descriptive) information about the applications of R/S-analysis in financial
markets.


2. Let $S = (S_n)_{n \ge 0}$ be some financial index and let $h_n = \ln\dfrac{S_n}{S_{n-1}}$.
The main idea of the use of R/S-analysis in the study of the properties of
$h = (h_n)_{n \ge 1}$ is as follows.

Let $H_n = h_1 + \dots + h_n$, $n \ge 1$, and let

$$R_n = \max_{k \le n}\Bigl(H_k - \frac{k}{n} H_n\Bigr) - \min_{k \le n}\Bigl(H_k - \frac{k}{n} H_n\Bigr) \tag{1}$$

(cf. Chapter III, § 2a).

The quantity $\bar{h}_n = \dfrac{H_n}{n}$ is the empirical mean of the sample $(h_1, h_2, \dots, h_n)$,
therefore $H_k - \dfrac{k}{n} H_n = \sum_{i=1}^{k} (h_i - \bar{h}_n)$ is the deviation of $H_k$ from the empirical mean
value $\dfrac{k}{n} H_n$, and the quantity $R_n$ itself characterizes the range of the sequences
$(h_1, \dots, h_n)$ and $(H_1, \dots, H_n)$ relative to their empirical means.
Also let

$$S_n^2 = \frac{1}{n}\sum_{k=1}^{n} h_k^2 - \Bigl(\frac{1}{n}\sum_{k=1}^{n} h_k\Bigr)^2 = \frac{1}{n}\sum_{k=1}^{n} (h_k - \bar{h}_n)^2 \tag{2}$$

be the empirical variance and let

$$Q_n = \frac{R_n}{S_n} \tag{3}$$

be the normalized, ‘adjusted range’ (in the terminology of [157]) of the cumulative
sums $H_k$, $k \le n$.

It is clear from (1)-(3) that $Q_n$ has the important property of invariance under
linear transformations $h_k \to c\,(h_k + m)$, $k \ge 1$. This valuable property makes this
statistic nonparametric (at any rate, it is independent of the first two moments of
the distributions of $h_k$, $k \ge 1$).
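
Definitions (1)-(3) translate directly into a few lines of code. The sketch below uses a hypothetical function name and a toy sample; it also checks the invariance of $Q_n$ under a transformation $h_k \to c\,(h_k + m)$ with arbitrarily chosen $c = 3$, $m = 7$.

```python
import numpy as np

def rs_statistic(h):
    """Return (R_n, S_n, Q_n) for the sample h_1, ..., h_n, following (1)-(3)."""
    h = np.asarray(h, dtype=float)
    n = len(h)
    H = np.cumsum(h)                              # H_k = h_1 + ... + h_k
    k = np.arange(1, n + 1)
    dev = H - k / n * H[-1]                       # H_k - (k/n) H_n
    R = dev.max() - dev.min()                     # adjusted range (1)
    S = np.sqrt(np.mean(h ** 2) - np.mean(h) ** 2)  # empirical standard deviation, from (2)
    return R, S, R / S

h = np.array([1.0, -1.0, 2.0, 0.0])
R, S, Q = rs_statistic(h)                         # R = 1.5, S = sqrt(1.25)

# Invariance under h_k -> c (h_k + m); c and m are arbitrary illustrative values.
_, _, Q_transformed = rs_statistic(3.0 * (h + 7.0))
```

For this sample the deviations $H_k - \frac{k}{n}H_n$ are $0.5, -1, 0.5, 0$, so $R_4 = 1.5$, and `Q_transformed` coincides with `Q` up to rounding.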


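3. In the case when $h_1, h_2, \dots$ is a sequence of independent identically distributed
random variables with $\mathsf{E} h_n = 0$ and $\mathsf{D} h_n = 1$, W. Feller [157] discovered that

$$\mathsf{E} R_n \sim \sqrt{\frac{\pi}{2}\,n} \quad (= 1.2533\ldots \times n^{1/2}) \tag{4}$$

and

$$\mathsf{D} R_n \sim \Bigl(\frac{\pi^2}{6} - \frac{\pi}{2}\Bigr) n \quad (= 0.07414\ldots \times n) \tag{5}$$

for large $n$.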
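
The asymptotic relations (4) and (5) can be checked roughly by simulation. The following sketch assumes standard normal $h_n$; the sample size, the number of simulated paths, and the seed are arbitrary choices made for the illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n, n_paths = 1000, 2000

h = rng.standard_normal((n_paths, n))             # i.i.d. N(0, 1) increments
H = np.cumsum(h, axis=1)
k = np.arange(1, n + 1)
dev = H - (k / n) * H[:, -1:]                     # H_k - (k/n) H_n for every path
R = dev.max(axis=1) - dev.min(axis=1)             # adjusted range R_n of each path

# Ratios of the simulated moments to the asymptotic values in (4) and (5).
mean_ratio = R.mean() / np.sqrt(np.pi * n / 2.0)
var_ratio = R.var() / ((np.pi ** 2 / 6.0 - np.pi / 2.0) * n)
```

Both ratios should be close to one; for finite $n$ the simulated mean is expected to lie slightly below the limiting value, because the maximum and minimum are taken over a discrete set of instants.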

This result can be readily understood using the Donsker-Prokhorov invariance
principle (see, e.g., [39] or [250]) asserting that the asymptotic distribution for
$R_n/\sqrt{n}$ is the distribution of the range

$$R^* = \sup_{0 \le t \le 1} B^0_t - \inf_{0 \le t \le 1} B^0_t \tag{6}$$

of a Brownian bridge $B^0 = (B^0_t)_{0 \le t \le 1}$, i.e., of the process

$$B^0_t = B_t - tB_1, \tag{7}$$

where $B = (B_t)_{t \ge 0}$ is a standard Brownian motion (see Chapter III, § 3a).

Indeed, let us consider the variable

$$\frac{R_n}{\sqrt{n}} = \max_{k \le n}\Bigl[\frac{H_k}{\sqrt{n}} - \frac{k}{n}\,\frac{H_n}{\sqrt{n}}\Bigr] - \min_{k \le n}\Bigl[\frac{H_k}{\sqrt{n}} - \frac{k}{n}\,\frac{H_n}{\sqrt{n}}\Bigr]. \tag{8}$$

By the very meaning of invariance (i.e., the independence of the particular form
of the distribution of the $h_k$), in looking for the limit distribution for $R_n/\sqrt{n}$ as
$n \to \infty$ we can assume that the $h_k$ have the standard normal distribution $\mathcal{N}(0,1)$.
If $B = (B_t)_{t \ge 0}$ is a standard Brownian motion, then the probability distributions
for the collections $\{H_k/\sqrt{n},\ k = 1, \dots, n\}$ and $\{B_{k/n},\ k = 1, \dots, n\}$ are the same,
so that

$$\frac{R_n}{\sqrt{n}} \stackrel{d}{=} \max_{\{t:\ t = \frac{1}{n}, \dots, \frac{n-1}{n}, 1\}} [B_t - tB_1] - \min_{\{t:\ t = \frac{1}{n}, \dots, \frac{n-1}{n}, 1\}} [B_t - tB_1] = \max_{\{t:\ t = \frac{1}{n}, \dots, \frac{n-1}{n}, 1\}} B^0_t - \min_{\{t:\ t = \frac{1}{n}, \dots, \frac{n-1}{n}, 1\}} B^0_t, \tag{9}$$

where the symbol ‘$\stackrel{d}{=}$’ means coincidence in distribution.

Hence it is clear that, as $n \to \infty$, the distribution of $R_n/\sqrt{n}$ (weakly) converges
to that of $R^*$. (Note that the functional $(\max - \min)(\cdot)$ is continuous in the
space of right-continuous functions having limits from the left. See, for example,
[39; Chapter 6], [250; Chapter VI], and [304] about this property and the convergence
of measures in such function spaces.)

We know the following explicit formula for the density $f^*(x)$ of the distribution
function $F^*(x) = \mathsf{P}(R^* \le x)$ (see formula (4.3) in [157]):

$$f^*(x) = x e''(x) + \sum_{k=2}^{\infty} \Bigl\{ 2k(k-1)\bigl[e'((k-1)x) - e'(kx)\bigr] + (k-1)^2 x\, e''((k-1)x) + k^2 x\, e''(kx) \Bigr\}, \tag{10}$$

where $e(x) = e^{-2x^2}$.


Using this formula it is easy to see that

$$\mathsf{E} R^* = \sqrt{\frac{\pi}{2}} \quad \text{and} \quad \mathsf{D} R^* = \frac{\pi^2}{6} - \frac{\pi}{2}. \tag{11}$$

We note by the way that $\mathsf{E} \sup_{t \le 1} |B_t| = \sqrt{\pi/2}$; see Chapter III, § 3b. Hence the mean
values of the statistics $R^*$ and $\sup_{t \le 1} |B_t|$ are the same.

4. Assuming that the variables $h_1, h_2, \dots$ are independent and identically distributed,
with $\mathsf{E} h_i = 0$ and $\mathsf{D} h_i = 1$, we see that $S_n^2 \to 1$ with probability one
as $n \to \infty$ (the strong law of large numbers). Hence the limit distribution for
$Q_n/\sqrt{n}$ as $n \to \infty$ is also the distribution of $R^*$.
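
The limit law of $R^*$ can be examined numerically from the series (10). The sketch below evaluates the truncated series on a grid and integrates it by the trapezoidal rule; the truncation level, the grid, and the helper names are assumptions of this illustration. The grid starts at $x = 0.15$ because the series converges slowly near zero, where the density itself is negligibly small.

```python
import numpy as np

def e0(z):
    return np.exp(-2.0 * z ** 2)                  # e(z) = exp(-2 z^2)

def e1(z):
    return -4.0 * z * e0(z)                       # first derivative e'(z)

def e2(z):
    return (16.0 * z ** 2 - 4.0) * e0(z)          # second derivative e''(z)

def range_density(x, n_terms=100):
    """Series (10) for the density of R*, truncated after n_terms terms."""
    f = x * e2(x)
    for k in range(2, n_terms + 1):
        f = f + (2 * k * (k - 1) * (e1((k - 1) * x) - e1(k * x))
                 + (k - 1) ** 2 * x * e2((k - 1) * x)
                 + k ** 2 * x * e2(k * x))
    return f

def trapezoid(y, x):
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

x = np.linspace(0.15, 6.0, 5851)
f = range_density(x)
mass = trapezoid(f, x)                            # total probability
mean = trapezoid(x * f, x)                        # E R*
var = trapezoid(x ** 2 * f, x) - mean ** 2        # D R*
```

The computed `mass`, `mean`, and `var` can then be compared with 1, $\sqrt{\pi/2}$, and $\pi^2/6 - \pi/2$ from (11).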

The distribution of $Q_n$ is independent of the mean value and the variance of
$h_k$, $k \le n$. This ‘nonparametric’ property brings us to the following criterion,
which enables one to reject (with one or another level of reliability) the following
hypothesis $\mathcal{H}_0$ lying in the foundation of the classical concept of an efficient market
(see Chapter I, §§ 2a,e): the prices in question are governed by the random walk
model.

The main idea of this criterion, which is based on the R/S-statistics, is as follows
(H. Hurst, [236]; [329], [386]).

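If hypothesis $\mathcal{H}_0$ is valid, then for sufficiently large $n$ the value of $R_n/S_n$ must
be close to $\mathsf{E}_0 \dfrac{R_n}{S_n} \sim \sqrt{\dfrac{\pi}{2}\,n}$, therefore

$$\ln\frac{R_n}{S_n} \approx \ln\sqrt{\frac{\pi}{2}} + \frac{1}{2}\ln n. \tag{12}$$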

(Of course, one can—and must—assign a precise probabilistic meaning to this re- 
lation and to relations (13) and (14) below by referring to limit theorems. We 
shall not do this here, taking the standpoint of a ‘sensible’ interpretation, which is 
common in practical statistics.) 

Thus, choosing a logarithmic scale along both axes we see that the values of
$\ln\dfrac{R_n}{S_n}$ must ‘cluster’ (provided $\mathcal{H}_0$ holds) along the straight line $a + \frac{1}{2}\ln n$ with
$a = \ln\sqrt{\pi/2}$ (see Fig. 47).
FIGURE 47. To R/S-analysis (the case of the validity of hypothesis $\mathcal{H}_0$)


This makes clear the method of R/S-analysis: given statistical data, we plot the
points $\Bigl(\ln n, \ln\dfrac{R_n}{S_n}\Bigr)$ (i.e., we use logarithmic scales); then, using the least squares
method, we draw a line $a_n + b_n \ln n$. If the value of $b_n$ ‘significantly’ deviates
from 1/2, then hypothesis $\mathcal{H}_0$ must be rejected. (Of course, one must master the
standard methods of statistical analysis so as to be able to calculate the significance
of the deviation of $b_n$ from 1/2 under the hypothesis $\mathcal{H}_0$. This is not easy because
the distribution of the statistics $R_n/S_n$ with finite $n$ is difficult to find explicitly;
we shall touch upon this subject below, in subsection 6.)
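
The least squares step can be sketched as follows. Note one assumption that is not taken from the text: for each $n$ the value of $R_n/S_n$ is averaged over non-overlapping blocks of length $n$, which is a common practical variant for reducing the scatter of the plotted points. The function names, block sizes, and the white noise test series are likewise hypothetical.

```python
import numpy as np

def rs(block):
    """Normalized adjusted range Q_n = R_n / S_n of one block of length n."""
    dev = np.cumsum(block - block.mean())         # H_k - (k/n) H_n, k = 1..n
    return (dev.max() - dev.min()) / block.std()

def rs_line(h, block_sizes):
    """Least squares line a + b ln n through the points (ln n, ln Q_n)."""
    h = np.asarray(h, dtype=float)
    log_n, log_q = [], []
    for n in block_sizes:
        blocks = h[: (len(h) // n) * n].reshape(-1, n)
        q_mean = np.mean([rs(b) for b in blocks]) # block averaging: an assumption
        log_n.append(np.log(n))
        log_q.append(np.log(q_mean))
    b, a = np.polyfit(log_n, log_q, 1)            # polyfit returns slope first
    return a, b

rng = np.random.default_rng(2)
white_noise = rng.standard_normal(2 ** 14)
a_hat, b_hat = rs_line(white_noise, [16, 32, 64, 128, 256, 512, 1024])
```

For white noise the slope `b_hat` should be near 1/2, although for moderate $n$ it is typically somewhat larger, which is one reason why the significance of a deviation from 1/2 has to be assessed with care.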

The principal significance of Hurst’s research lies in his discovery (by means of
R/S-analysis) that, in place of the expected (for Nile and other rivers) property

$$\frac{R_n}{S_n} \sim c\,n^{1/2} \tag{13}$$

one actually has

$$\frac{R_n}{S_n} \sim c\,n^{H} \tag{14}$$

with $H$ considerably larger than 1/2.


5. If an empirical study shows (somewhat unexpectedly) that (14) holds, then one
should ask about models governing sequences $h = (h_n)_{n \ge 1}$ with this property.

We must also explain why $H > 1/2$ in many cases. (As we shall see below,
one possible explanation is that $h = (h_n)_{n \ge 1}$ is a sequence with long memory and
positive correlation.)

In this connection, we discuss now several ideas, which are based, in particular, 
on numerical computer calculations and several results of R/S-analysis. Doing that 
we shall mainly follow [316] and [319]. 

As a rule, if a sequence $h = (h_n)_{n \ge 1}$ is characterized by ‘weak dependence’ (as
are Markov, autoregressive, and some other sequences) then $H$ (the Hurst parameter)
is close to 1/2. One usually says in this case that the system $h = (h_n)_{n \ge 1}$ has
a ‘short R/S-memory’.

If $h_n = B_H(n) - B_H(n-1)$, $n \ge 1$, where $B_H = (B_H(t))_{t \ge 0}$ is a fractional
Brownian motion (Chapter III, § 2b), then the variables $\dfrac{R_n}{S_n n^{H}}$ have a nontrivial
limit distribution as $n \to \infty$, which can be indicated as in (14). Hence if a statistical
study shows that $0 < H < 1$ and $H \ne 1/2$, then a fractional Gaussian noise is one
possible explanation.

We recall (see Chapter III, § 2c) that in the case of such a noise, the corresponding
correlation is positive if $H > 1/2$ and negative if $H < 1/2$. This explains why one
talks about persistence in trend-following, a ‘long memory’, a ‘strong aftereffect’ in
the first case: if the values have increased, then the odds are that they will increase
further.

As pointed out in [386], the (hitherto widespread in the literature) opinion that
$H$ must be always larger than 1/2 in financial time series is ungrounded. The case
$H < 1/2$ occurs, for instance, for the returns of volatility (see Chapter III, § 2d.5
and §4b.3 in this chapter), which are characterized by strong alternation (‘antipersistence’).


6. We have already mentioned that in using R/S-analysis to determine the values
of $H$ in the conjectural relation (14) we must, of course, determine the degree of
agreement between the empirical data and the model corresponding to the chosen
value of $H$.

In other words, there arises a standard question on the reliability of statistical 
inference, and one has to turn to the ‘goodness-of-fit’, ‘significance’, or other tests 
developed in mathematical statistics. 


It should be pointed out that the complexity of the statistics $\dfrac{R_n}{S_n}$ leaves no hope
for satisfactory formulas describing its probability distributions for various $n$ even if
we accept the null hypothesis $\mathcal{H}_0$. (Nevertheless, the question of the behavior of the
mean value $\mathsf{E}_0 \dfrac{R_n}{S_n}$, where $\mathsf{E}_0$ is averaging under condition $\mathcal{H}_0$, has been considered
in [8].)

This explains the widespread use of the Monte Carlo method in R/S-analysis
(see, for instance, [317], [329], [385], [386]), in particular, to assess the quality of
the estimation of the unknown value of $H$.
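
A Monte Carlo assessment of this kind can be sketched as follows: one simulates many white noise series (hypothesis $\mathcal{H}_0$), estimates the slope of the log-log line for each of them, and uses the empirical distribution of the slopes as a reference. The series length, block sizes, number of replications, and the block-averaging estimator are all assumptions of this illustration, not prescriptions from the cited works.

```python
import numpy as np

def rs_slope(h, block_sizes):
    """Slope of the least squares line through (ln n, ln mean Q_n), with block averaging."""
    log_q = []
    for n in block_sizes:
        b = h[: (len(h) // n) * n].reshape(-1, n)
        dev = np.cumsum(b - b.mean(axis=1, keepdims=True), axis=1)
        q = (dev.max(axis=1) - dev.min(axis=1)) / b.std(axis=1)
        log_q.append(np.log(q.mean()))
    return np.polyfit(np.log(block_sizes), log_q, 1)[0]

rng = np.random.default_rng(4)
block_sizes = [16, 32, 64, 128, 256]
n_obs, n_rep = 2048, 200

# Empirical distribution of the slope estimate under the white noise hypothesis.
slopes = np.array([rs_slope(rng.standard_normal(n_obs), block_sizes)
                   for _ in range(n_rep)])
lower, upper = np.quantile(slopes, [0.025, 0.975])
```

An estimate obtained from real data with the same `n_obs` and `block_sizes` could then be compared with the interval (`lower`, `upper`) rather than with the nominal value 1/2.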


7. When adjusting theoretical models to real statistical data it is reasonable to
start with simple schemes allowing an easy analytic study. For instance, describing
the probabilistic structure of the sequences $h = (h_n)_{n \ge 1}$ one can well assume that
this is fractional Gaussian noise with $0 < H < 1$. If $H = 1/2$, then we obtain usual
Gaussian white noise, which lies at the core of many linear (AR, MA, ARMA) and
nonlinear (ARCH, GARCH) models.

R/S-analysis produces good results in models where $h = (h_n)_{n \ge 1}$ is a sequence
of a fractional Gaussian noise kind (see [317], [329], [385], [386]). In dealing with
other models it is reasonable to consider, besides $Q_n = \dfrac{R_n}{S_n}$, the statistics

$$V_n = \frac{Q_n}{\sqrt{n}}. \tag{15}$$

A mere visual examination of the latter often brings important statistical conclusions.

This method is based on the thesis that, for ‘white’ noise ($H = 1/2$) the statistics
$V_n$ must stabilize for large $n$ (that is, $V_n \to c$ for some constant $c$, where we
understand the convergence in a suitable probabilistic sense).

If $h = (h_n)_{n \ge 1}$ is fractional Gaussian noise with $H > 1/2$, then the values of $V_n$
must increase (with $n$); by contrast, $V_n$ must decrease for $H < 1/2$.

Bearing this in mind we consider now the simplest, autoregressive model of order
one (AR(1)), in which

$$h_n = \alpha_0 + \alpha_1 h_{n-1} + \varepsilon_n, \qquad n \ge 1. \tag{16}$$

The behavior of the sequence in this model is completely determined by the ‘noise’
terms $\varepsilon_n$ and the initial value $h_0$.

The corresponding (to these $h_n$, $n \ge 1$) variables $V = (V_n)_{n \ge 1}$ increase with $n$.
However, this does not mean that we have here fractional Gaussian noise with
$H > 1/2$ for the mere reason that this growth can be a consequence of the linear
dependence in (16).

Hence, to understand the ‘stochastic’ nature of the sequence $\varepsilon = (\varepsilon_n)_{n \ge 1}$, in
place of the variables $h = (h_n)_{n \ge 1}$ one would rather consider the linearly adjusted
variables $h^0 = (h^0_n)_{n \ge 1}$, where $h^0_n = h_n - (a_0 + a_1 h_{n-1})$ and $a_0$, $a_1$ are some
estimates of the (unknown, in general) parameters $\alpha_0$ and $\alpha_1$.

Producing a sequence $h = (h_n)_{n \ge 1}$ in accordance with (16), where $\varepsilon = (\varepsilon_n)_{n \ge 1}$
is white Gaussian noise, we can see that, indeed, the behavior of the corresponding
variables $V^0_n = V_n(h^0)$ is just as it should be for fractional Gaussian noise
with $H = 1/2$. The following picture is a crude illustration of these phenomena:


FIGURE 48. Statistics $V_n(h)$ and $V^0_n = V_n(h^0)$ for the AR(1) model.
The expectation $\mathsf{E}_0\Bigl(\dfrac{R_n}{S_n\sqrt{n}}\Bigr)$ is evaluated under condition $\mathcal{H}_0$
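
The AR(1) experiment described above can be reproduced along the following lines. This is only a sketch: the parameter values $\alpha_0 = 0$, $\alpha_1 = 0.5$, the series length, the block sizes, the ordinary least squares estimates $a_0$, $a_1$, and the averaging of $V_n$ over non-overlapping blocks are all assumptions made for the illustration.

```python
import numpy as np

def v_stat(h, n):
    """V_n = Q_n / sqrt(n), averaged over non-overlapping blocks of length n."""
    b = h[: (len(h) // n) * n].reshape(-1, n)
    dev = np.cumsum(b - b.mean(axis=1, keepdims=True), axis=1)
    q = (dev.max(axis=1) - dev.min(axis=1)) / b.std(axis=1)
    return q.mean() / np.sqrt(n)

rng = np.random.default_rng(3)
N, alpha0, alpha1 = 20000, 0.0, 0.5               # hypothetical AR(1) parameters
eps = rng.standard_normal(N)

# Generate h_n = alpha0 + alpha1 * h_{n-1} + eps_n as in (16).
h = np.empty(N)
h[0] = eps[0]
for i in range(1, N):
    h[i] = alpha0 + alpha1 * h[i - 1] + eps[i]

# Least squares estimates a0, a1 and the linearly adjusted variables h0.
a1, a0 = np.polyfit(h[:-1], h[1:], 1)
h0 = h[1:] - (a0 + a1 * h[:-1])

sizes = [25, 50, 100, 200, 400]
V_h = [v_stat(h, n) for n in sizes]               # statistics V_n(h)
V_h0 = [v_stat(h0, n) for n in sizes]             # statistics V_n(h0)
```

In this experiment the values `V_h` lie above the values `V_h0` and grow with $n$, although the only source of dependence is the linear term in (16); after the linear adjustment the statistics behave like those of white noise.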


If we consider the linear models MA(1) or ARMA(1, 1), then the general pattern 
(see [386; Chapter 5]) is the same as in Fig. 48. 


For the nonlinear ARCH and GARCH models realizations of $V_n(h)$ and $V_n(h^0)$
show another pattern of behavior relative to $\mathsf{E}_0\Bigl(\dfrac{R_n}{S_n\sqrt{n}}\Bigr)$, $n \ge 1$ (see Fig. 49).

First, the graphs of $V_n(h)$ and $V_n(h^0)$ are fairly similar, which can be interpreted
as the absence of linear dependence between the $h_n$. Second, if $n$ is not very large,
then the graphs of $V_n(h)$ and $V_n(h^0)$ lie slightly above the graph of $\mathsf{E}_0\Bigl(\dfrac{R_n}{S_n\sqrt{n}}\Bigr)$
corresponding to white noise, which can be interpreted as a ‘slight persistence’ in
the model governing the variables $h_n$, $n \ge 1$. Third, the ‘antipersistence’ effect
comes to the forefront with the growth of $n$.

FIGURE 49. Statistics $V_n(h)$ and $V^0_n = V_n(h^0)$ for the ARCH(1) model


Remark. The terms ‘persistence’, ‘antipersistence’, and the others used in this section
are adequate in the case of models of fractional Gaussian noise kind. However, the
ARCH, GARCH, and kindred models (Chapter II, §§ 3a, b) are neither fractional
nor self-similar. Hence one must carry out a more thorough investigation to explain
the phenomena of ‘antipersistence’ type in these models. This is all the more
important as both linear and nonlinear models of these types are very popular in
the analysis of financial time series and one needs to know what particular local or
global features of empirical data related to their behavior in time can be ‘captured’
by these models.


8. As pointed out in Chapter I, § 2a, the underlying idea of M. Kendall’s analysis 
of stock and commodity prices [269] is to discover cycles and trends in the behavior 
of these prices. 

Market analysts and, in particular, ‘technicians’ (Chapter I, § 2e) do their research
first of all on the premise that there do exist certain cycles and trends in the
market, that market dynamics has a rhythmic nature.

This explains why, in the analysis of financial series, one pays so much attention 
to finding similar segments of realizations in order to use these similarities in the 
predictions of the price development. 

Statistical R/S-analysis proved to be very efficient not only in discovering the
above-mentioned phenomena of ‘aftereffect’, ‘long memory’, ‘persistence’, and ‘antipersistence’,
but also in the search for periodic or aperiodic cycles (see, e.g., [317],
[319], [329], [385], [386]).

A classical example of a system where aperiodic cyclicity is clearly visible is 
solar activity. 

As is known, there exists a convenient indicator of this activity, the Wolf numbers
related to the number of ‘black spots’ on the surface of the Sun. The monthly data for
approximately 150 years are available and a mere visual inspection reveals the

FIGURE 50. R/S-analysis of the Wolf numbers ($n = 1$ month, $2.12 = \log_{10}(12 \cdot 11)$,
$h_n = x_n - x_{n-1}$, $x_n$ are the monthly Wolf numbers)


The results of the R/S-analysis of the Wolf numbers (see [385; p. 78]) are roughly
as shown in Fig. 50.


Estimating the parameter $H$ we obtain $\widehat{H} = 0.54$, which points to the tendency of
keeping the current level of activity (‘persistence’). It is also clearly visible in Fig. 50
that the behavior pattern of $\ln\dfrac{R_n}{S_n}$ changes drastically close to 11 years: the
values of $\dfrac{R_n}{S_n}$ stabilize, which can be explained by an 11-year cycle of solar activity.
Indeed, if there are periodic or ‘aperiodic’ cycles, then the range calculated for the
second, third, and the following cycles cannot be much larger than its value obtained
for the first cycle. In the same way, the empirical variance usually stabilizes. All
this shows that R/S-analysis is good in discovering cycles in such phenomena as
solar activity.

In conclusion we point out that the analysis of statistical dynamical systems
usually deals with two kinds of noise: the ‘endogenous noise’ of the system, which
in fact determines its statistical nature (for example, the stochastic nature of solar
activity) and the ‘outside noise’, which is usually an additive noise resulting from
measurement errors (occurring, say, in counting ‘black spots’, when one can take a
cluster of small spots for a single big one).

All this taken into account, we must emphasize that R/S-analysis is robust as
concerns ‘outside noises’; this additional feature makes it rather efficient in the
study of the ‘intrinsic’ stochastic nature of statistical dynamical systems.

§4b. R/S-Analysis of Some Financial Time Series


1. On gaining a general impression of the R/S-analysis of fractal, linear, and non- 
linear models and its use for the detection of cycles, the idea to apply it to concrete 
financial time series (DJIA, the S&P500 Index, stock and bond prices, currency 
cross rates) comes naturally. 

We have repeatedly pointed out (and we do it also now) that it is extremely
important for the analysis of financial data to indicate the time intervals $\Delta$ at
which investors, traders, ... ‘read off’ information. We shall indicate this interval
$\Delta$ explicitly, and, keeping the notation of § 2b, we shall denote by $h_{i\Delta}$ the variable
$h_{i\Delta} = \ln\dfrac{S_{i\Delta}}{S_{(i-1)\Delta}}$, where $S_t$ is the value at time $t$ of the financial index under
consideration.

Remark. One can find many data on the applications of R/S-analysis to time 
series, including financial ones, in [317], [323], [325], [327], [329] and in the later 
books [385] and [386]. The brief outline of these results that is presented below 
follows mainly [385] and [386]. 


2. DJIA (see Chapter I, §1b.6; the statistical data are published in The Wall
Street Journal since 1888). We set $\Delta$ to be 1, 5, or 20 days. In the following table
we sum up the results of the R/S-analysis:

  Δ          Number of        Estimate      Cycle
             observations     of H          (days)
  1 day      …                …             …
  5 days     …                …             1040
  20 days    650              0.72          1040

As regards the behavior of the statistics Vn, one can see that the corresponding 
values first increase with n. This growth ceases for n = 52 (in the case of A equal to 
20 days; that is, in 1040 days), which indicates the presence of a cyclic component 
in the data. 

For the daily data (A is 1 day) the statistics Vn grows till n is approximately 1250. 
Thereupon, its behavior stabilizes, which indicates (cf. Fig. 49 in § 4a) a formation 
of a cycle (with approximately 4-year period; this is usually associated with the 
4-year period between presidential elections in the USA). 


3. The S&P500 Index (see Chapter I, § 1b.6; § 2d.2 in this chapter) develops
in accordance with a ‘tick’ pattern. There exists a rather comprehensive database
for this index. The analysis of monthly data ($\Delta = 1$ month) from January, 1950
through July, 1988 shows ([385; Chapter 8]) that, as in the case of DJIA, there
exists an approximately 4-year cycle.

A more refined R/S-analysis with $\Delta = 1$, 5, and 30 minutes (based on the data
from 1989 through 1992; [386; Chapter 9]) delivers the following estimates $\widehat{H}$ of the
Hurst parameter:

$$\widehat{H} = 0.603,\ 0.590,\ \text{and}\ 0.653, \tag{1}$$

which indicates a certain ‘persistence’ in the dynamics of S&P500.


It is worth noting that the transition from the variables $h_n = \ln \frac{S_n}{S_{n-1}}$, $S_n = S_{n\Delta}$,
to the ‘linearly adjusted’ ones,

$$\widetilde h_n = h_n - (a_0 + a_1 h_{n-1}),$$

results in smaller estimates of $H$ (cf. (1)). Now,

$$\widetilde H = 0.551,\ 0.546,\ \text{and}\ 0.594,$$

which is close to the mean values $\mathsf E_0 \widehat H$ (equal to 0.538, 0.540, and 0.563, respectively)
calculated under condition $\mathscr H_0$ on the basis of the same observations.

All this apparently means that for short intervals of time the behavior of the 
S&P500 Index can be described in the first approximation by traditional linear 
models (even by simple ones, such as the AR(1) model). 

This can serve as an explanation for the widespread belief that high-frequency
intraday data have autoregressive character and for the fact that ‘daily’ traders
make decisions by reacting to the last ‘tick’ rather than taking into account the ‘long
memory’ and the past prices. The pattern changes, however, with the growth of the
time interval $\Delta$ determining ‘time sharp-sightedness’ of the traders. In particular,
the R/S-analysis carried out for $\Delta$ equal to, e.g., 1 month clearly reveals a fractal
structure; namely, the value $H \approx 0.78$ calculated on the 48-month basis is fairly
large (the data from January, 1963 through December, 1989; [385; Chapter 8]).

We have already mentioned that large values of the Hurst parameter indicate 
the presence of ‘persistence’, which can bring about trends and cycles. 

In our case, a mere visual analysis of the values of the $V_n$ clearly indicates
(cf. Fig. 49 in § 4a) the presence of a four-year cycle. As in the case of DJIA, this is
usually explained by successive economic cycles, related maybe to the presidential
elections in the USA.

It is appropriate to recall now (see § 3a) that the statistical R/S-analysis of
the daily values of the variable $\widehat r_n = \ln \frac{\widehat\sigma_n}{\widehat\sigma_{n-1}}$, where the empirical dispersion $\widehat\sigma_n$ is
calculated by formula (8) in § 3a on the basis of the S&P500 Index (from January 1,
1928 through December 31, 1989; [386; Chapter 10]), suggests an approximate value
of the Hurst parameter of about 0.31, which is much less than 1/2 and indicates
the phenomenon of ‘antipersistence’. One visual consequence of this is the fast
alternation of the values of $\widehat r_n$. In other words, if for an arbitrary $n$ the value of
$\widehat\sigma_n$ is larger than that of $\widehat\sigma_{n-1}$, then the value of $\widehat\sigma_{n+1}$ is very likely to be smaller
than $\widehat\sigma_n$.
4. R/S-analysis of stock prices enables one not merely to reveal their fractal
structure and discover cycles, but also to compare them from the viewpoint of the
riskiness of the corresponding stock. According to the data in [385; Chapter 8], the
values of the Hurst parameter $H = H(\cdot)$ and the lengths $C = C(\cdot)$ (measured in
months) of the cycles for the S&P500 Index and the stock of several corporations
included in this index are as follows:


H(S&P500) = 0.78, C(S&P500) = 46; 
H(IBM) = 0.72, C(IBM) = 18; 
H(Apple Computer) = 0.75, C(Apple Computer) = 18; 


H(Consolidated Edison) = 0.68, C(Consolidated Edison) = 90.


As we see, the S&P500 Index on its own has Hurst parameter larger than the 
corporations it covers. We also see that the value of this parameter for, say, Ap- 
ple Computer is rather large (H = 0.75), considerably larger than the parameter 
characteristic for, say, Consolidated Edison. 

We note further that if $H = 1$, then a fractional Brownian motion has the
representation $B_1(t) = t\xi$, where $\xi$ is a normally distributed random variable with
expectation zero and variance one. All the ‘randomness’ of $B_1 = (B_1(t))_{t \ge 0}$ has its
source in $\xi$, so that this is the least ‘noisy’ of all fractional Brownian motions with
$0 < H \le 1$. Clearly, the noise component of the process $B_H$ ‘decreases’ as $H \uparrow 1$. In
the financial context this means that models based on these processes become less
risky. This observation of the decrease of the noise component can be put in the
form of a precise statement: the process $B_H$ converges weakly to $B_1$ as $H \uparrow 1$. We
note also that the correlation functions $\rho_H(n) = \mathsf E h_k h_{n+k}$ are positive for $H > 1/2$
(‘persistence’, the property of keeping the trends of dynamics) and $\rho_H(n) \to 1$ for
all $n$ as $H \uparrow 1$.
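The behavior of the correlation functions can be checked numerically. The sketch below assumes the standard closed form for the increments of a fractional Brownian motion normalized to unit variance, $\rho_H(n) = \frac12\big(|n+1|^{2H} - 2|n|^{2H} + |n-1|^{2H}\big)$; this formula is not restated in the present section, and the function name is illustrative only.

```python
def fgn_correlation(n: int, hurst: float) -> float:
    """Lag-n correlation of unit-variance increments of a fractional Brownian motion.

    Assumes the standard formula 0.5 * (|n+1|^{2H} - 2|n|^{2H} + |n-1|^{2H}).
    """
    two_h = 2.0 * hurst
    return 0.5 * (abs(n + 1) ** two_h - 2.0 * abs(n) ** two_h + abs(n - 1) ** two_h)


# Values of H quoted in this section, plus H close to 1.
for hurst in (0.31, 0.5, 0.68, 0.78, 0.99):
    print(hurst, [round(fgn_correlation(n, hurst), 4) for n in (1, 2, 5)])
```

Under this formula the correlations vanish for $H = 1/2$, are negative for $H < 1/2$, positive for $H > 1/2$, and identically equal to one at $H = 1$.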

As noted in [385], such an interpretation seems more promising in cases where 
the processes in question have a large Hurst parameter H, but do not have variance, 
which lies at the core of the concept of riskiness. 

One reasonable explanation of the fact that the value of H corresponding to the 
S&P500 Index is large, so that securities based on this index are less risky than 
corporate stock, is diversification (see Chapter I, § 2b), which reduces the noise 
factor. 


Remark 1. We emphasize that by ‘lesser risks’ here we mean ‘less noise’, ‘stronger 
persistence’, showing itself in the tendency to preserve the development trend. It 
should be pointed out, however, that systems with large H are prone to sharp 
changes of the direction of evolution. A long series of ups can be followed by a long 
series of downs. 


Remark 2. Turning back to the values of H and C for the stock of various corpo- 
rations, we should also mention the following observation made in [385]: usually, a 
high level of innovations in a company results in large values of H and short cycles, 
while a low level of innovations corresponds to small values of H and long cycles. 
5. Bonds. The R/S-analysis of the monthly data concerning the American 30-year
Treasury Bonds (T-Bonds) over the period from January, 1950 through December,
1989 (see [385; Chapter 8]) also discovers a rather high value of the fractality
parameter $H \approx 0.68$, with cycles of approximately 5 years.


6. Currency cross rates. There exist considerable differences between currency 
cross rates and such financial indexes as DJIA, the S&P500 Index, or stock and 
bond prices. 

The point is that a purchase or a sale of stock has a direct relation to invest- 
ments in the corresponding industry. The currency is different: it is bought or sold 
to facilitate consumption, the expansion of production, and so on. Moreover, large 
volumes of exchange influence at least two countries, are determined to a consider- 
able degree by their political and economical situations, and depend strongly on the 
policy of the central banks (who make interventions in the money markets, change 
the interest rates, and so on). 

Of course, these factors influence the statistical properties of cross rates and
their dynamics.

One important characteristic of the changeability of cross rates is the $\Delta$-volatility
defined in terms of the increments $|H_{k\Delta} - H_{(k-1)\Delta}|$, where $H_t = \ln \frac{S_t}{S_0}$ for the cross
rate $S_t$ (see, for example, formula (5) in § 1c).

We must emphasize in this connection that the above-discussed statistics $\mathscr R_n$
and $\frac{\mathscr R_n}{\mathscr S_n}$ are also, in effect, characteristics of the changeability, ‘range’ of the process
$H = (H_t)_{t \ge 0}$. It is therefore not surprising that many properties of exchange rates
already described in the previous sections can be also detected by R/S-analysis.

By contrast to such financial indexes as DJIA or the S&P500, the fractal structure
(at any rate, for small values of $\Delta > 0$) and the tendency towards its preservation
in time are clearly visible in the evolution of the exchange rates.

This is revealed by a mere analysis of the behavior of the statistics $\ln \frac{\mathscr R_n}{\mathscr S_n}$ as
functions of $\log n$ by the least squares method. This analysis shows that the values
of $\ln \frac{\mathscr R_n}{\mathscr S_n}$, $n \ge 1$, distinctly cluster along the line $c + H \ln n$, with $H$ considerably
larger than 1/2 for most currencies. For instance, $H(\mathrm{JPY/USD}) \approx 0.64$,
$H(\mathrm{DEM/USD}) \approx 0.64$, $H(\mathrm{GBP/USD}) \approx 0.61$.
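The least squares procedure just described can be sketched as follows. The sketch assumes the usual definitions of R/S-analysis, which are not restated in this section: for increments $h_1, \dots, h_n$ with cumulative sums $H_k$, the range is $\max_{k \le n}(H_k - \frac{k}{n} H_n) - \min_{k \le n}(H_k - \frac{k}{n} H_n)$ and the normalization is the empirical standard deviation of the $h_k$. All function names are illustrative, and the input is simulated white noise rather than exchange-rate data.

```python
import math
import random


def rescaled_range(h):
    """R_n / S_n for the increments h_1, ..., h_n (assumed standard definition)."""
    n = len(h)
    mean = sum(h) / n
    level, deviations = 0.0, []
    for x in h:
        level += x - mean            # equals H_k - (k/n) * H_n
        deviations.append(level)
    r = max(deviations) - min(deviations)
    s = math.sqrt(sum((x - mean) ** 2 for x in h) / n)
    return r / s


def least_squares_line(xs, ys):
    """Return (intercept c, slope H) of the least squares line y = c + H * x."""
    m = len(xs)
    x_bar, y_bar = sum(xs) / m, sum(ys) / m
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def hurst_estimate(h, window_sizes):
    """Slope of ln(R_n/S_n) against ln n, using the first n increments for each n."""
    xs = [math.log(n) for n in window_sizes]
    ys = [math.log(rescaled_range(h[:n])) for n in window_sizes]
    return least_squares_line(xs, ys)[1]


random.seed(0)
noise = [random.gauss(0.0, 1.0) for _ in range(2048)]   # hypothetical test data
print(hurst_estimate(noise, [2 ** j for j in range(4, 12)]))
```

For real data one would replace `noise` by the observed logarithmic returns; the choice of window sizes is a design decision that affects the estimate.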

All this means that currency exchange rates have fractal structures with rather
large Hurst parameters. It is worth recalling in this connection that, in the case of a
fractional Brownian motion, $\mathsf E H_t^2$ grows as $|t|^{2H}$. Hence for $H > 1/2$ the dispersion
of the values of $|H_t|$ is larger than for a standard Brownian motion, which indicates
that the riskiness of exchange grows with time. Arguably, this can explain why
high-intensity short-term trading is more popular in the currencies market than
long-term operations.
1. Investment Portfolio on a (B, S)-Market


§ 1a. Strategies Satisfying Balance Conditions


1. We assume that the securities market under consideration operates in the
conditions of ‘uncertainty’ that can be described in the probabilistic framework in terms
of a filtered probability space

$$(\Omega, \mathscr F, (\mathscr F_n)_{n \ge 0}, \mathsf P).$$

We interpret the flow $\mathbb F = (\mathscr F_n)_{n \ge 0}$ of $\sigma$-algebras as the ‘flow of information’ $\mathscr F_n$
accessible to all participants up to the instant $n$, $n \ge 0$.
We shall consider a $(B, S)$-market formed (by definition) by $d + 1$ assets:

a bank account $B$ (a ‘risk-free’ asset)

and

stock (‘risk’ assets) $S = (S^1, \dots, S^d)$.

We assume that the evolution of the bank account can be described by a positive
stochastic sequence

$$B = (B_n)_{n \ge 0},$$

where the variables $B_n$ are $\mathscr F_{n-1}$-measurable for each $n \ge 1$.
The dynamics of the value of the $i$th risk asset $S^i$ can also be described by a
positive stochastic sequence

$$S^i = (S^i_n)_{n \ge 0},$$

where the $S^i_n$ are $\mathscr F_n$-measurable variables for each $n \ge 0$.
From these definitions one can clearly see the crucial difference between a bank
account and stock. Namely, the $\mathscr F_{n-1}$-measurability of $B_n$ means that the state of
the bank account at time $n$ is already clear (provided that one has all information)
at time $n - 1$: the variable $B_n$ is predictable (in this sense).

The situation with stock prices is entirely different: the variables $S^i_n$ are
$\mathscr F_n$-measurable, which means that their actual values are known only after one
obtains all ‘information’ $\mathscr F_n$ arriving at time $n$.

This explains why one says that a bank account is ‘risk-free’ while stocks are 
‘risk’ assets. 


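Setting

$$r_n = \frac{\Delta B_n}{B_{n-1}}, \qquad \rho^i_n = \frac{\Delta S^i_n}{S^i_{n-1}}, \tag{1}$$

we can write

$$\Delta B_n = r_n B_{n-1}, \tag{2}$$

$$\Delta S^i_n = \rho^i_n S^i_{n-1}, \tag{3}$$

where the (interest rates) $r_n$ are $\mathscr F_{n-1}$-measurable and the $\rho^i_n$ are $\mathscr F_n$-measurable.
Thus, for $n \ge 1$ we have

$$B_n = B_0 \prod_{1 \le k \le n} (1 + r_k) \tag{4}$$

and

$$S^i_n = S^i_0 \prod_{1 \le k \le n} (1 + \rho^i_k). \tag{5}$$

In accordance with our terminology in Chapter II, § 1a, we say that the
representations (4) and (5) are of ‘simple interest’ kind.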
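As a small illustration of representations (4) and (5), the following sketch builds the sequences $B_n$ and $S_n$ (for $d = 1$) from given rates. The numerical rates and the function name are hypothetical; positivity of the sequences requires every rate to exceed $-1$.

```python
from math import isclose


def compound(initial, rates):
    """Return [X_0, ..., X_n] with X_n = X_0 * prod_{k <= n} (1 + rate_k), as in (4)-(5)."""
    values = [initial]
    for rate in rates:
        values.append(values[-1] * (1.0 + rate))
    return values


bank = compound(100.0, [0.10, 0.10])      # B_0 = 100 with r_1 = r_2 = 10% (hypothetical)
stock = compound(50.0, [0.20, -0.50])     # S_0 = 50 with rho_1 = 20%, rho_2 = -50% (hypothetical)
print(bank, stock)
```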


2. Imagine an investor on the (B, S)-market that can 
a) deposit money into the bank account and borrow from it; 
b) sell and buy stock. 


We shall assume that a transfer of money from one asset into another can be 
done with no transaction costs and the assets are ‘infinitely divisible’, i.e., the 
investor can buy or sell any portion of stock and withdraw or deposit any amount 
from the bank account. 

We now give several definitions relating to the financial position of an investor 
in such a (B, S)-market and his actions. 


DEFINITION 1. A (predictable) stochastic sequence

$$\pi = (\beta, \gamma),$$

where $\beta = (\beta_n(\omega))_{n \ge 0}$ and $\gamma = (\gamma^1_n(\omega), \dots, \gamma^d_n(\omega))_{n \ge 0}$ with $\mathscr F_{n-1}$-measurable
$\beta_n(\omega)$ and $\gamma^i_n(\omega)$ for all $n \ge 0$ and $i = 1, \dots, d$ (we set $\mathscr F_{-1} = \mathscr F_0$), is called
an investment portfolio on the $(B, S)$-market.


If $d = 1$, then we shall write $\gamma_n$ and $S_n$ in place of $\gamma^1_n$ and $S^1_n$.

We must now emphasize several important points. 

The variables $\beta_n(\omega)$ and $\gamma^i_n(\omega)$ can be positive, equal to zero, or even negative,
which means that the investor, in conformity with a) and b), can borrow from the
bank account or sell stock short.

The assumption of $\mathscr F_{n-1}$-measurability means that the variables $\beta_n(\omega)$ and
$\gamma^i_n(\omega)$ describing the financial position of the investor at time $n$ (the amount he has
on the bank account, the stock in his possession) are determined by the information
available at time $n - 1$, not $n$ (the ‘tomorrow’ position is completely defined by the
‘today’ situation).

Time $n = 0$ plays a special role here (as in the entire theory of stochastic processes
based on the concept of filtered probability spaces). This is reflected by the
fact that the predictability at the instant $n = 0$ (formally, this must be equivalent to
‘$\mathscr F_{-1}$-measurability’) is the same as $\mathscr F_0$-measurability. (The agreement $\mathscr F_{-1} = \mathscr F_0$
in the definition is convenient for a uniform approach to all instants $n \ge 0$.)

To emphasize the dynamics of an investment portfolio one often uses the term
‘investment strategy’ instead. We shall also use it sometimes.

We have assumed above that $n \ge 0$. Of course, all the definitions will be the
same if we are bounded by some finite ‘time horizon’ $N$. In this case we shall
assume that $0 \le n \le N$ in place of $n \ge 0$.


DEFINITION 2. The value of an investment portfolio $\pi$ is the stochastic sequence

$$X^\pi = (X^\pi_n)_{n \ge 0},$$

where

$$X^\pi_n = \beta_n B_n + \sum_{i=1}^d \gamma^i_n S^i_n. \tag{6}$$

To avoid lengthy formulas we shall use ‘coordinate-free’ notation in what follows,
denoting the scalar product

$$(\gamma_n, S_n) = \sum_{i=1}^d \gamma^i_n S^i_n$$

of vectors $\gamma_n = (\gamma^1_n, \dots, \gamma^d_n)$ and $S_n = (S^1_n, \dots, S^d_n)$ by $\gamma_n S_n$.
Thus, let

$$X^\pi_n = \beta_n B_n + \gamma_n S_n. \tag{7}$$
For two arbitrary sequences $a = (a_n)_{n \ge 0}$ and $b = (b_n)_{n \ge 0}$ we have

$$\Delta(a_n b_n) = a_n \Delta b_n + b_{n-1} \Delta a_n. \tag{8}$$

Applying this formula (of ‘discrete differentiation’) to the right-hand side of (7) we
see that

$$\Delta X^\pi_n = [\beta_n \Delta B_n + \gamma_n \Delta S_n] + [B_{n-1} \Delta\beta_n + S_{n-1} \Delta\gamma_n]. \tag{9}$$

This shows that changes ($\Delta X^\pi_n = X^\pi_n - X^\pi_{n-1}$) in the value of a portfolio are,
in general, sums of two components: changes of the state of a unit bank account
and of stock prices ($\beta_n \Delta B_n + \gamma_n \Delta S_n$) and changes in the portfolio composition
($B_{n-1} \Delta\beta_n + S_{n-1} \Delta\gamma_n$). It is reasonable to assume that real changes of this value
are always due to the increments $\Delta B_n$ and $\Delta S_n$ (and not to $\Delta\beta_n$ and $\Delta\gamma_n$).

Thus, we conclude that the capital gains on the investment portfolio $\pi$ are
described by the sequence $G^\pi = (G^\pi_n)_{n \ge 0}$, where $G^\pi_0 = 0$ and

$$G^\pi_n = \sum_{k=1}^n (\beta_k \Delta B_k + \gamma_k \Delta S_k). \tag{10}$$

Hence the value of the portfolio at time $n$ is

$$X^\pi_n = X^\pi_0 + G^\pi_n, \tag{11}$$

which brings us in a natural way to the following definition.


DEFINITION 3. We say that an investment portfolio $\pi$ is self-financing if its value
$X^\pi = (X^\pi_n)_{n \ge 0}$ can be represented as the following sum:

$$X^\pi_n = X^\pi_0 + \sum_{k=1}^n (\beta_k \Delta B_k + \gamma_k \Delta S_k), \qquad n \ge 1. \tag{12}$$

That is, self-financing here is equivalent to the following condition describing
‘admissible’ portfolios $\pi$:

$$B_{n-1} \Delta\beta_n + S_{n-1} \Delta\gamma_n = 0, \qquad n \ge 1. \tag{13}$$

The message is perfectly clear: the change of the amount on the bank account can
be only due to the change of the package of shares and vice versa.

We denote the class of self-financing strategies by SF.

We note also that we have shown the equivalence of relations

$$(6) + (13) \iff (6) + (12).$$
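The equivalence can be checked on a small numerical example for $d = 1$. In the sketch below the bank-account holdings are chosen so that condition (13) holds at every step, and the resulting portfolio values are compared with the representation (12). All prices, holdings, and function names are hypothetical.

```python
from math import isclose


def self_financing_portfolio(bank, stock, gamma, x0):
    """Return (beta, values) for d = 1, with beta_n determined by x0 and condition (13).

    bank[n], stock[n] are B_n, S_n for n = 0..N; gamma[n] is the stock holding gamma_n.
    """
    beta = [(x0 - gamma[0] * stock[0]) / bank[0]]
    for n in range(1, len(bank)):
        # (13): B_{n-1} * (beta_n - beta_{n-1}) + S_{n-1} * (gamma_n - gamma_{n-1}) = 0
        beta.append(beta[-1] - stock[n - 1] * (gamma[n] - gamma[n - 1]) / bank[n - 1])
    values = [b * B + g * S for b, B, g, S in zip(beta, bank, gamma, stock)]
    return beta, values


def capital_gains(bank, stock, beta, gamma):
    """G_n from (10): cumulative sums of beta_k * dB_k + gamma_k * dS_k."""
    gains = [0.0]
    for k in range(1, len(bank)):
        step = beta[k] * (bank[k] - bank[k - 1]) + gamma[k] * (stock[k] - stock[k - 1])
        gains.append(gains[-1] + step)
    return gains


bank = [1.0, 1.05, 1.1025]       # B_0, B_1, B_2 (hypothetical)
stock = [100.0, 120.0, 90.0]     # S_0, S_1, S_2 (hypothetical)
gamma = [1.0, 2.0, 0.5]          # gamma_0, gamma_1, gamma_2 (hypothetical)
beta, values = self_financing_portfolio(bank, stock, gamma, x0=150.0)
gains = capital_gains(bank, stock, beta, gamma)
print(values, [values[0] + g for g in gains])   # the two lists agree, as in (12)
```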
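3. It is perfectly clear that in the operations with the portfolio $\pi$ it is better to
reduce the number of different assets in it or, at least, simplify their structure. In
accordance with one possible (and commonly used here) approach, ‘if we a priori
have $B_n > 0$, $n \ge 1$, then we can set $B_n \equiv 1$’. This is a consequence of the following
observation.

Along with the $(B, S)$-market we can consider a new market $(\widetilde B, \widetilde S)$, where

$$\widetilde B = (\widetilde B_n)_{n \ge 0} \quad \text{with} \quad \widetilde B_n \equiv 1$$

and

$$\widetilde S = (\widetilde S_n)_{n \ge 0} \quad \text{with} \quad \widetilde S_n = \frac{S_n}{B_n}.$$

Then the value $\widetilde X^\pi = (\widetilde X^\pi_n)_{n \ge 0}$ of the portfolio $\pi = (\beta, \gamma)$ is as follows:

$$\widetilde X^\pi_n = \beta_n \widetilde B_n + \gamma_n \widetilde S_n = \beta_n + \gamma_n \widetilde S_n = \frac{1}{B_n} (\beta_n B_n + \gamma_n S_n) = \frac{X^\pi_n}{B_n}. \tag{14}$$

In addition, if $\pi$ is self-financing in the $(B, S)$-market, then it has this property also
on the $(\widetilde B, \widetilde S)$-market, for (see (13))

$$\widetilde B_{n-1} \Delta\beta_n + \widetilde S_{n-1} \Delta\gamma_n = \frac{1}{B_{n-1}} (B_{n-1} \Delta\beta_n + S_{n-1} \Delta\gamma_n) = 0. \tag{15}$$

Since $\Delta \widetilde B_n = 0$, it follows by (12) that

$$\widetilde X^\pi_n = \widetilde X^\pi_0 + \sum_{k=1}^n \gamma_k \Delta \widetilde S_k \tag{16}$$

for $\pi \in SF$, or, more explicitly,

$$\widetilde X^\pi_n = \widetilde X^\pi_0 + \sum_{k=1}^n \Big[ \sum_{i=1}^d \gamma^i_k \Delta \widetilde S^i_k \Big], \qquad \widetilde S^i_k = \frac{S^i_k}{B_k}. \tag{17}$$

Thus, from (14) and (16) we see that the discounted value $\frac{X^\pi}{B} = \Big( \frac{X^\pi_n}{B_n} \Big)_{n \ge 0}$ of
$\pi \in SF$ satisfies the relation

$$\Delta \Big( \frac{X^\pi_n}{B_n} \Big) = \gamma_n \Delta \Big( \frac{S_n}{B_n} \Big), \tag{18}$$

which, for all its simplicity, plays a key role in many calculations below based on
the concept of ‘arbitrage-free market’.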
It follows from the above that

$$(6) + (12) \implies (6) + (18).$$

It is easy to verify that the reverse implication also holds. Thus, we have the
following relation, which can be useful in the construction of self-financing strategies
and in the inspection of whether a particular strategy is self-financing:

$$(6) + (13) \iff (6) + (12) \iff (6) + (18).$$
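A quick numerical check of this chain for $d = 1$: the sketch below rebalances according to (13) and then verifies that the increments of the discounted value $X^\pi_n / B_n$ coincide with $\gamma_n \Delta(S_n / B_n)$, which is relation (18). The numbers and the helper name `relation_18_holds` are hypothetical.

```python
from math import isclose

bank = [1.0, 1.05, 1.1025]       # B_0, B_1, B_2 (hypothetical)
stock = [100.0, 120.0, 90.0]     # S_0, S_1, S_2 (hypothetical)
gamma = [1.0, 2.0, 0.5]          # gamma_0, gamma_1, gamma_2 (hypothetical)

# Bank holdings determined by condition (13), starting from beta_0 = 50.
beta = [50.0]
for n in range(1, 3):
    beta.append(beta[-1] - stock[n - 1] * (gamma[n] - gamma[n - 1]) / bank[n - 1])

discounted_stock = [stock[n] / bank[n] for n in range(3)]                     # S_n / B_n
discounted_value = [beta[n] + gamma[n] * discounted_stock[n] for n in range(3)]  # X_n / B_n, cf. (14)


def relation_18_holds(n):
    """Compare both sides of (18) at step n."""
    lhs = discounted_value[n] - discounted_value[n - 1]
    rhs = gamma[n] * (discounted_stock[n] - discounted_stock[n - 1])
    return isclose(lhs, rhs, abs_tol=1e-9)


print([relation_18_holds(n) for n in (1, 2)])
```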


We note also that (18) is in a certain sense more consistent from the financial
point of view than the equality

$$\Delta X^\pi_n = \beta_n \Delta B_n + \gamma_n \Delta S_n, \tag{19}$$

following from (12). Indeed, comparing prices one is more interested in their relative
values than in the absolute ones. (We have already mentioned this in Chapter I,
§ 2a.4.) This explains why we consider the discounted variables $\widetilde B = \frac{B}{B}$ $(\equiv 1)$
and $\widetilde S = \frac{S}{B}$ in place of $B$ and $S$.


Our choice of the bank account as a discount factor is a mere convention;
we could take any of the assets $S^1, \dots, S^d$ or their combination instead. However,
the fact that the bank account is a ‘predictable’ asset simplifies the analysis
to a certain extent. For instance, this ‘predictability’ means that the distribution
$\mathrm{Law}\Big( \frac{X^\pi_n}{B_n} \,\Big|\, \mathscr F_{n-1} \Big)$ can be determined from the conditional distribution
$\mathrm{Law}(X^\pi_n \mid \mathscr F_{n-1})$, because the ‘condition’ $\mathscr F_{n-1}$ means that the variable $B_n$
is known. Moreover, the bank account plays the role of a ‘reference point’ in economics,
a ‘standard’ in the valuation of other securities, which display the same
behavior ‘on the average’.


4. The above-discussed evolution of the capital $X^\pi$ in a $(B, S)$-market relates to
the case when there are no ‘inflows or outflows of funds’ and the ‘transaction costs’
are negligible. Of course, we can think of other schemes, where the change of capital
$\Delta X^\pi_n$ does not proceed in accordance with (19), but has a more complicated form,
where shareholder dividends, consumption, transaction costs, etc. are taken into
account. We consider now several examples of this kind.

The case of ‘dividends’. Let $D = (D_n = (D^1_n, \dots, D^d_n))_{n \ge 0}$ be a $d$-dimensional
sequence of $\mathscr F_n$-measurable variables $D^i_n$, $D^i_0 = 0$, and assume that
$\delta^i_n \equiv \Delta D^i_n \ge 0$. We shall treat $D^i_n$ as the dividend income generated by one
share $S^i$. Then one has a right to assume that the change of the portfolio value
$X^\pi = (X^\pi_n)_{n \ge 0}$ can be described by the formula

$$\Delta X^\pi_n = \beta_n \Delta B_n + \gamma_n (\Delta S_n + \Delta D_n), \tag{20}$$
while $X^\pi_n$ is itself the following combination of the bank account, stock prices, and
dividends:

$$X^\pi_n = \beta_n B_n + \gamma_n (S_n + \Delta D_n). \tag{21}$$

It is easy to obtain by (20) and (21) the following modification of condition (13)
of self-financing, suitable for this case:

$$B_{n-1} \Delta\beta_n + S_{n-1} \Delta\gamma_n - \delta_{n-1} \gamma_{n-1} = 0. \tag{22}$$

Moreover,

$$(21) + (22) \iff (21) + (20), \tag{23}$$

and (18) can be generalized as follows:

$$\Delta \Big( \frac{X^\pi_n}{B_n} \Big) = \gamma_n \Delta \Big( \frac{S_n}{B_n} \Big) + \gamma_n \frac{\Delta D_n}{B_n}. \tag{24}$$

The case of ‘consumption and investment’. We assume that $C = (C_n)_{n \ge 0}$
and $I = (I_n)_{n \ge 0}$ are nonnegative nondecreasing ($\Delta C_n \ge 0$, $\Delta I_n \ge 0$) processes
with $C_0 = 0$, $I_0 = 0$, and with $\mathscr F_n$-measurable $C_n$ and $I_n$.
Assume that the evolution of the portfolio value $X^\pi = (X^\pi_n)_{n \ge 0}$ is described by
the formula

$$\Delta X^\pi_n = \beta_n \Delta B_n + \gamma_n \Delta S_n + \Delta I_n - \Delta C_n. \tag{25}$$

This explains why we call this the case with ‘consumption and investment’. The
process $C = (C_n)_{n \ge 0}$ is called the ‘consumption process’ and $I = (I_n)_{n \ge 0}$ is called
the ‘investment process’.

Clearly, if we write $X^\pi_n = \beta_n B_n + \gamma_n S_n$, then (25) brings us to the following
‘admissibility’ constraint on the components of $\pi$:

$$B_{n-1} \Delta\beta_n + S_{n-1} \Delta\gamma_n = \Delta I_n - \Delta C_n. \tag{26}$$

Relation (18) has now the following generalization:

$$\Delta \Big( \frac{X^\pi_n}{B_n} \Big) = \gamma_n \Delta \Big( \frac{S_n}{B_n} \Big) + \frac{\Delta (I_n - C_n)}{B_{n-1}}. \tag{27}$$
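The step from (26) to (27) is not spelled out in the text. One way to fill it in, offered here only as a sketch, is to apply the ‘discrete differentiation’ formula (8) to the product $\gamma_n \cdot \frac{S_n}{B_n}$ and then substitute the balance condition (26):

```latex
\begin{align*}
\Delta\Big(\frac{X^\pi_n}{B_n}\Big)
  &= \Delta\Big(\beta_n + \gamma_n \frac{S_n}{B_n}\Big)
   = \Delta\beta_n + \gamma_n \Delta\Big(\frac{S_n}{B_n}\Big)
     + \frac{S_{n-1}}{B_{n-1}}\,\Delta\gamma_n \\
  &= \gamma_n \Delta\Big(\frac{S_n}{B_n}\Big)
     + \frac{B_{n-1}\Delta\beta_n + S_{n-1}\Delta\gamma_n}{B_{n-1}}
   = \gamma_n \Delta\Big(\frac{S_n}{B_n}\Big)
     + \frac{\Delta I_n - \Delta C_n}{B_{n-1}} .
\end{align*}
```

With $\Delta I_n = \Delta C_n = 0$ the last term vanishes and the computation reduces to relation (18).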


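Combining both cases (with ‘dividends’ and with ‘consumption and investment’)
we see that for an ‘admissible’ (i.e., satisfying (26)) portfolio $\pi$ we have

$$\Delta \Big( \frac{X^\pi_n}{B_n} \Big) = \gamma_n \Big[ \Delta \Big( \frac{S_n}{B_n} \Big) + \frac{\Delta D_n}{B_n} \Big] + \frac{\Delta (I_n - C_n)}{B_{n-1}}. \tag{28}$$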

The case of ‘operating expense’ (of stock trading). Relation (13) considered
above has a clear financial meaning of a budgetary, balance restriction. It shows
that if, e.g., $\Delta\gamma_n > 0$, then buying stock of (unit) price $S_{n-1}$ we must withdraw
from the bank account the amount $-B_{n-1} \Delta\beta_n$ equal to $S_{n-1} \Delta\gamma_n$. On the other
hand, if $\Delta\gamma_n < 0$ (i.e., we sell stock), then the amount put into the bank account
is $B_{n-1} \Delta\beta_n = -S_{n-1} \Delta\gamma_n > 0$.

Imagine now that each stock transaction involves some operating costs. Then,
purchasing $\Delta\gamma_n > 0$ shares of share price $S_{n-1}$, it is no longer sufficient, due to
transaction costs, that we withdraw $S_{n-1} \Delta\gamma_n$ from the bank account; we must
withdraw instead an amount $(1 + \lambda) S_{n-1} \Delta\gamma_n$ with $\lambda > 0$.

On the other hand, selling stock ($\Delta\gamma_n < 0$), we cannot deposit all the amount of
$-S_{n-1} \Delta\gamma_n$ into our bank account, but only a smaller sum, say, $-(1 - \mu) S_{n-1} \Delta\gamma_n$
with $\mu > 0$.

It is now clear that, with non-zero transaction costs, we must replace balance
condition (13) imposed on the portfolio $\pi$ by the following condition of admissibility:

$$B_{n-1} \Delta\beta_n + (1 + \lambda) S_{n-1} \Delta\gamma_n I(\Delta\gamma_n > 0) + (1 - \mu) S_{n-1} \Delta\gamma_n I(\Delta\gamma_n < 0) = 0, \tag{29}$$

which can be rewritten as

$$B_{n-1} \Delta\beta_n + S_{n-1} \Delta\gamma_n + \lambda S_{n-1} (\Delta\gamma_n)^+ + \mu S_{n-1} (\Delta\gamma_n)^- = 0, \tag{30}$$

where $(\Delta\gamma_n)^+ = \max(0, \Delta\gamma_n)$ and $(\Delta\gamma_n)^- = -\min(0, \Delta\gamma_n)$.
If $\lambda = \mu$, then (30) takes the form

$$B_{n-1} \Delta\beta_n + S_{n-1} [\Delta\gamma_n + \lambda |\Delta\gamma_n|] = 0. \tag{31}$$


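To find a counterpart to (18) we observe that, in view of (30),

$$\begin{aligned}
\Delta \Big( \frac{X^\pi_n}{B_n} \Big) &= \frac{X^\pi_n}{B_n} - \frac{X^\pi_{n-1}}{B_{n-1}} \\
&= \frac{\beta_n B_n + \gamma_n S_n}{B_n} - \frac{\beta_{n-1} B_{n-1} + \gamma_{n-1} S_{n-1}}{B_{n-1}} \\
&= \Delta\beta_n + \gamma_n \Delta \Big( \frac{S_n}{B_n} \Big) + \frac{S_{n-1}}{B_{n-1}} \Delta\gamma_n \\
&= \frac{B_{n-1} \Delta\beta_n + S_{n-1} \Delta\gamma_n}{B_{n-1}} + \gamma_n \Delta \Big( \frac{S_n}{B_n} \Big) \\
&= -\frac{S_{n-1}}{B_{n-1}} \big( \lambda (\Delta\gamma_n)^+ + \mu (\Delta\gamma_n)^- \big) + \gamma_n \Delta \Big( \frac{S_n}{B_n} \Big).
\end{aligned}$$

Hence, if the transaction costs specified by the parameters $\lambda$ and $\mu$ are non-zero
and the balance requirement (29) holds, then the evolution of the discounted
portfolio value $\frac{X^\pi_n}{B_n}$ with $X^\pi_n = \beta_n B_n + \gamma_n S_n$ can be described by the relation

$$\Delta \Big( \frac{X^\pi_n}{B_n} \Big) = \gamma_n \Delta \Big( \frac{S_n}{B_n} \Big) - \frac{S_{n-1}}{B_{n-1}} \big[ \lambda (\Delta\gamma_n)^+ + \mu (\Delta\gamma_n)^- \big]. \tag{32}$$

If $\lambda = \mu$, then

$$\Delta \Big( \frac{X^\pi_n}{B_n} \Big) = \gamma_n \Delta \Big( \frac{S_n}{B_n} \Big) - \lambda \frac{S_{n-1}}{B_{n-1}} |\Delta\gamma_n|. \tag{33}$$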
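Relation (32) can be illustrated for $d = 1$ by a single rebalancing step. In the sketch below the new bank holding is computed from the balance condition (29), and the resulting increment of the discounted portfolio value is compared with the right-hand side of (32), once for a purchase and once for a sale. The cost rates, prices, and function names are hypothetical.

```python
from math import isclose


def rebalance_with_costs(beta_prev, gamma_prev, gamma_new, bank_prev, stock_prev, lam, mu):
    """Return beta_n obtained from the balance condition (29) for d = 1."""
    d_gamma = gamma_new - gamma_prev
    if d_gamma > 0:
        cash_out = (1.0 + lam) * stock_prev * d_gamma    # a purchase costs more than S * d_gamma
    else:
        cash_out = (1.0 - mu) * stock_prev * d_gamma     # a sale brings in less (negative outflow)
    return beta_prev - cash_out / bank_prev


def discounted_increment(gamma_new, d_gamma, bank_prev, bank_new, stock_prev, stock_new, lam, mu):
    """Right-hand side of (32)."""
    plus, minus = max(0.0, d_gamma), -min(0.0, d_gamma)
    price_term = gamma_new * (stock_new / bank_new - stock_prev / bank_prev)
    cost_term = stock_prev / bank_prev * (lam * plus + mu * minus)
    return price_term - cost_term


bank_prev, bank_new = 1.0, 1.05          # B_{n-1}, B_n (hypothetical)
stock_prev, stock_new = 100.0, 120.0     # S_{n-1}, S_n (hypothetical)
beta_prev, gamma_prev = 50.0, 1.0
lam, mu = 0.01, 0.02                     # proportional cost rates (hypothetical)

for gamma_new in (2.0, 0.5):             # one purchase, one sale
    beta_new = rebalance_with_costs(beta_prev, gamma_prev, gamma_new, bank_prev, stock_prev, lam, mu)
    lhs = (beta_new + gamma_new * stock_new / bank_new) - (beta_prev + gamma_prev * stock_prev / bank_prev)
    rhs = discounted_increment(gamma_new, gamma_new - gamma_prev, bank_prev, bank_new,
                               stock_prev, stock_new, lam, mu)
    print(gamma_new, isclose(lhs, rhs))
```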


We can also regard this problem of transaction costs from another, equivalent,
standpoint.

Let $\widehat S_n = \gamma_n S_n$ and $\widehat B_n = \beta_n B_n$ be the funds invested in stock and deposited in
the bank account, respectively, for the portfolio $\pi$ at time $n$.

Assume that

$$\begin{aligned}
\Delta \widehat S_n &= \rho \widehat S_{n-1} + \Delta L_n - \Delta M_n, \\
\Delta \widehat B_n &= r \widehat B_{n-1} - (1 + \lambda) \Delta L_n + (1 - \mu) \Delta M_n,
\end{aligned} \tag{34}$$

where $L_n$ is the cumulative transfer of funds from the bank account into stock by
time $n$, which has required (due to transaction costs) a withdrawal of the (larger)
amount $(1 + \lambda) L_n$. In a similar way, let $M_n$ be the cumulative amount obtained
for the stock sold, and let the smaller amount $(1 - \mu) \Delta M_n$ (again, due to the
transaction costs) be deposited into the account.
It is clear from (34) that we have the following relation between the total capital
assets $X^\pi_n = \widehat B_n + \widehat S_n$, the strategy $\pi$, and the transfers $(L, M)$, where $L = (L_n)$,
$M = (M_n)$, and $L_0 = M_0 = 0$:

$$\begin{aligned}
\Delta X^\pi_n &= r \widehat B_{n-1} + \rho \widehat S_{n-1} - (\lambda \Delta L_n + \mu \Delta M_n) \\
&= r \beta_{n-1} B_{n-1} + \rho \gamma_{n-1} S_{n-1} - (\lambda \Delta L_n + \mu \Delta M_n) \\
&= \beta_{n-1} \Delta B_n + \gamma_{n-1} \Delta S_n - (\lambda \Delta L_n + \mu \Delta M_n).
\end{aligned}$$

However,

$$\Delta X^\pi_n = \beta_{n-1} \Delta B_n + \gamma_{n-1} \Delta S_n + B_{n-1} \Delta\beta_n + S_{n-1} \Delta\gamma_n.$$

Thus, we see that

$$B_{n-1} \Delta\beta_n + S_{n-1} \Delta\gamma_n + \lambda \Delta L_n + \mu \Delta M_n = 0. \tag{35}$$

Setting here

$$\Delta L_n = S_{n-1} (\Delta\gamma_n)^+, \qquad \Delta M_n = S_{n-1} (\Delta\gamma_n)^-,$$

we obtain that (35) is equivalent to (30).

The general case. We can combine these four cases into one, the case where
there are dividends, consumption, investment, and transaction costs.
Namely, using the above notation we shall assume that the evolution of the
portfolio value

$$X^\pi_n = \beta_n B_n + \gamma_n (S_n + \Delta D_n) \tag{36}$$

can be described by the formula

$$\Delta X^\pi_n = \beta_n \Delta B_n + \gamma_n (\Delta S_n + \Delta D_n) + \Delta I_n - \Delta C_n - S_{n-1} \big[ \lambda (\Delta\gamma_n)^+ + \mu (\Delta\gamma_n)^- \big]. \tag{37}$$

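By (36) and (37) we obtain the following balance condition:

$$B_{n-1} \Delta\beta_n + S_{n-1} \Delta\gamma_n = -\big[ \Delta C_n - \Delta I_n - \delta_{n-1} \gamma_{n-1} + \mu S_{n-1} (\Delta\gamma_n)^- + \lambda S_{n-1} (\Delta\gamma_n)^+ \big], \tag{38}$$

which describes admissible $\pi$.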
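Since

$$\Delta \Big( \frac{X^\pi_n}{B_n} \Big) = \frac{B_{n-1} \Delta\beta_n + S_{n-1} \Delta\gamma_n}{B_{n-1}} + \frac{\gamma_n \delta_n}{B_n} - \frac{\gamma_{n-1} \delta_{n-1}}{B_{n-1}} + \gamma_n \Delta \Big( \frac{S_n}{B_n} \Big),$$

it follows in view of (38) that

$$\Delta \Big( \frac{X^\pi_n}{B_n} \Big) = \gamma_n \Big[ \Delta \Big( \frac{S_n}{B_n} \Big) + \frac{\Delta D_n}{B_n} \Big] + \frac{\Delta (I_n - C_n)}{B_{n-1}} - \frac{S_{n-1}}{B_{n-1}} \big[ \lambda (\Delta\gamma_n)^+ + \mu (\Delta\gamma_n)^- \big] \tag{39}$$

for an admissible portfolio. If $\lambda = \mu$, then

$$\Delta \Big( \frac{X^\pi_n}{B_n} \Big) = \gamma_n \Big[ \Delta \Big( \frac{S_n}{B_n} \Big) + \frac{\Delta D_n}{B_n} \Big] + \frac{\Delta (I_n - C_n)}{B_{n-1}} - \frac{\lambda S_{n-1} |\Delta\gamma_n|}{B_{n-1}}. \tag{40}$$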

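5. We now turn to formulas (18) and (27) and assume that the sequence
$\frac{S}{B} = \Big( \frac{S_n}{B_n} \Big)_{n \ge 0}$ is a ($d$-dimensional) martingale with respect to the basic flow
$\mathbb F = (\mathscr F_n)_{n \ge 0}$ and the probability measure $\mathsf P$.

By (18),

$$\frac{X^\pi_n}{B_n} = \frac{X^\pi_0}{B_0} + \sum_{k=1}^n \gamma_k \Delta \Big( \frac{S_k}{B_k} \Big), \tag{41}$$

so that (since the $\gamma_k$, $k \ge 1$, are predictable) the sequence

$$\Big( \frac{X^\pi_n}{B_n} - \frac{X^\pi_0}{B_0} \Big)_{n \ge 0} \tag{42}$$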
is a martingale transform and, therefore (see Chapter II, § 1c), a local martingale.
We shall assume that $X^\pi_0$ and $B_0$ are constants. Then for each portfolio $\pi$ such
that

$$\inf_n \frac{X^\pi_n}{B_n} \ge C > -\infty, \tag{43}$$

where $C$ is some constant, we see that

$$\sum_{k=1}^n \gamma_k \Delta \Big( \frac{S_k}{B_k} \Big) \ge C - \frac{X^\pi_0}{B_0} \tag{44}$$

for each $n \ge 1$, i.e., the local martingale

$$\Big( \sum_{k=1}^n \gamma_k \Delta \Big( \frac{S_k}{B_k} \Big) \Big)_{n \ge 1}$$

is bounded below and, by a lemma in Chapter II, § 1c, is a martingale.

Thus, if the discounted value of $\pi$ satisfies (43) (for brevity we shall write
$\pi \in \Pi_C$), then this discounted portfolio value $\Big( \frac{X^\pi_n}{B_n} \Big)_{n \ge 0}$ is a martingale and,
in particular,

$$\mathsf E \frac{X^\pi_n}{B_n} = \frac{X^\pi_0}{B_0} \tag{45}$$

for each $n \ge 0$.
We can make the following observation concerning this property.
Assume that for some $(B, S)$-market the sequence $\frac{S}{B} = \Big( \frac{S_n}{B_n} \Big)_{n \ge 0}$ is a martingale.
(Using a term from Chapter I, § 2a, this market is efficient.) Then there
is no strategy $\pi \in \Pi_C$ (i.e., satisfying condition (43) with some constant $C$) that
would enable an investor (on an efficient market) to attain a larger mean discounted
portfolio value $\mathsf E \frac{X^\pi_n}{B_n}$ than the initial discounted value $\frac{X^\pi_0}{B_0}$.

An economic explanation is that the strategies $\pi \in \Pi_C$ are not sufficiently risky:
an investor has no opportunity of ‘running into large debts’ (which would mean that
he has an unlimited credit).

In the same way, if $\pi \in \Pi^C$, i.e.,

$$\sup_n \frac{X^\pi_n}{B_n} \le C < \infty \quad (\mathsf P\text{-a.s.}) \tag{46}$$

for some $C$, then the same lemma (in Chapter II, § 1c) shows that the sequence
$\Big( \frac{X^\pi_n}{B_n} \Big)_{n \ge 0}$ is a martingale again, and therefore (45) holds (for an efficient market).
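Property (45) can be verified exactly on a small finite model. The sketch below is an illustration under assumptions not made in the text: the discounted stock price moves by a factor 2 or 1/2 with probabilities 1/3 and 2/3 (so that it is a martingale), the horizon is finite, and the holding $\gamma_k$ is a hypothetical rule depending only on earlier moves, hence predictable. Exact rational arithmetic avoids rounding issues.

```python
from fractions import Fraction
from itertools import product

UP, DOWN = Fraction(2), Fraction(1, 2)
P_UP = Fraction(1, 3)     # chosen so that P_UP * UP + (1 - P_UP) * DOWN = 1


def gamma(history):
    """A hypothetical predictable rule: the holding depends on past moves only."""
    if not history:
        return Fraction(1)
    return Fraction(3) if history[-1] == DOWN else Fraction(-1)


def expected_discounted_value(x0, s0, horizon):
    """E[X_N / B_N] computed from (41) by enumerating all price paths."""
    total = Fraction(0)
    for moves in product((UP, DOWN), repeat=horizon):
        prob, price, value = Fraction(1), s0, x0
        for k, move in enumerate(moves):
            new_price = price * move
            value += gamma(moves[:k]) * (new_price - price)   # gamma_k times the increment of S/B
            prob *= P_UP if move == UP else 1 - P_UP
            price = new_price
        total += prob * value
    return total


print(expected_discounted_value(Fraction(10), Fraction(4), 5))
```

On a finite model every discounted portfolio value is bounded, so condition (43) holds automatically; the rule `gamma` may be replaced by any other function of the past moves without changing the expectation.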
It is interesting in many problems of financial mathematics whether property (45)
holds after a replacement of the deterministic instant $n$ by a Markov time $\tau = \tau(\omega)$.
This is important, e.g., for American options (see Chapter VI), where a stopping
time, the time of taking a decision of, say, exercising the option, is already included
in the notion of a strategy.

It can be immediately observed that if a stochastic sequence $Y = (Y_n, \mathscr F_n)$ is a
martingale, then $\mathsf E(Y_n \mid \mathscr F_{n-1}) = Y_{n-1}$ and $\mathsf E|Y_n| < \infty$, so that $\mathsf E Y_n = \mathsf E Y_0$, while
the property

$$\mathsf E Y_\tau = \mathsf E Y_0 \tag{47}$$

does not necessarily hold for a random stopping time $\tau = \tau(\omega) < \infty$, $\omega \in \Omega$
(cf. Chapter III, § 3a, where we consider the case of a Brownian motion). It holds
if, e.g., $\tau$ is a bounded stopping time ($\tau(\omega) \le N < \infty$, $\omega \in \Omega$). As regards various
sufficient conditions, we can refer to § 3a in this chapter and to Chapter VII, § 2
in [439], where one can find several criteria of property (47). (In a few words,
these criteria indicate that one should not allow ‘very large’ $\tau$ and/or $\mathsf E|Y_\tau|$ if (47)
is to be satisfied.) We present an example where (47) fails in § 2b below.
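The role of boundedness can be seen on a symmetric random walk, used here purely as an illustration (it is an assumption of this sketch, not the example referred to in the text). Take $Y_0 = 0$, steps $\pm 1$ with probability 1/2, and let $\tau$ be the first instant at which $Y$ reaches the level 1. For the bounded stopping time $\min(\tau, N)$ the expectation is computed exactly below.

```python
from fractions import Fraction
from itertools import product


def expected_stopped_value(horizon, level=1):
    """E[Y_{min(tau, N)}] for a symmetric random walk, tau = first n with Y_n = level."""
    total = Fraction(0)
    for steps in product((1, -1), repeat=horizon):
        position = 0
        for step in steps:
            if position == level:     # already stopped: later steps are ignored
                break
            position += step
        total += Fraction(position, 2 ** horizon)   # each path has probability 2^(-N)
    return total


print([expected_stopped_value(n) for n in (1, 2, 5, 10)])
```

Every bounded version gives expectation zero, in agreement with (47), although the walk stopped at $\tau$ itself equals 1 whenever $\tau$ is finite.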


6. In the preceding discussion we were operating an investment portfolio with no
restrictions on the possible values of the $\beta_n$ and $\gamma_n = (\gamma^1_n, \dots, \gamma^d_n)$; we set $\beta_n \in \mathbb R$
and $\gamma^i_n \in \mathbb R$.

In practice, however, there can be various sorts of constraints. For instance, the
condition $\beta_n \ge 0$ means that we cannot borrow from the bank account at time $n$.
If $\gamma_n \ge 0$, then short-selling is not allowed.

If we impose the condition

$$\frac{\gamma_n S_n}{X^\pi_n} \le \alpha,$$

where $\alpha$ is a fixed constant, then the proportion of the ‘risky’ component $\gamma_n S_n$ in
the total capital $X^\pi_n$ should not be larger than $\alpha$.
Other examples of constraints on $\pi$ are conditions of the types

$$\mathsf P(X^\pi_N \in A) \ge 1 - \varepsilon$$

(for some fixed $N$, $\varepsilon > 0$, and some set $A$) or

$$\mathsf P(X^\pi_N \ge f_N) = 1$$

for some $\mathscr F_N$-measurable functions $f_N = f_N(\omega)$.
We consider the last case thoroughly in the next section, in connection with
hedging.
§ 1b. Notion of ‘Hedging’. Upper and Lower Prices.
Complete and Incomplete Markets.


1. We shall assume that transactions in our $(B, S)$-market are made only at the
instants $n = 0, 1, \dots, N$.

Let $f_N = f_N(\omega)$ be a non-negative $\mathscr F_N$-measurable function (performance)
treated as an ‘obligation’, ‘terminal’ pay-off.

DEFINITION 1. An investment portfolio $\pi = (\beta, \gamma)$ with $\beta = (\beta_n)$, $\gamma = (\gamma_n)$,
$n = 0, 1, \dots, N$, is called an upper $(x, f_N)$-hedge (a lower $(x, f_N)$-hedge) for $x \ge 0$
if $X^\pi_0 = x$ and $X^\pi_N \ge f_N$ ($\mathsf P$-a.s.) (respectively, $X^\pi_N \le f_N$ ($\mathsf P$-a.s.)).

We say that an $(x, f_N)$-hedge $\pi$ is perfect if $X^\pi_0 = x$, $x \ge 0$, and $X^\pi_N = f_N$
($\mathsf P$-a.s.).

The concept of hedge plays an extremely important role in financial mathematics
and practice. This is an instrument of protection enabling one to have a guaranteed
level of capital and insuring transactions on securities markets.

A definition below will help us formalize actions aimed at securing a certain
level of capital.


2. Let

$$H^*(x, f_N; \mathsf P) = \{\pi\colon X^\pi_0 = x,\ X^\pi_N \ge f_N \ (\mathsf P\text{-a.s.})\}$$

be the class of upper $(x, f_N)$-hedges and let

$$H_*(x, f_N; \mathsf P) = \{\pi\colon X^\pi_0 = x,\ X^\pi_N \le f_N \ (\mathsf P\text{-a.s.})\}$$

be the class of lower $(x, f_N)$-hedges.


DEFINITION 2. Let $f_N$ be a pay-off function. Then we call the quantity

$$C^*(f_N; \mathsf P) = \inf\{x \ge 0\colon H^*(x, f_N; \mathsf P) \ne \varnothing\}$$

the upper price (of hedging against $f_N$).
The quantity

$$C_*(f_N; \mathsf P) = \sup\{x \ge 0\colon H_*(x, f_N; \mathsf P) \ne \varnothing\}$$

is called the lower price (of hedging against $f_N$).


Remark 1. As usual, we put $C^*(f_N; \mathsf P) = \infty$ if $H^*(x, f_N; \mathsf P) = \varnothing$ for all $x \ge 0$.
The set $H_*(0, f_N; \mathsf P)$ is nonempty (it suffices to consider $\beta_n \equiv 0$ and $\gamma_n \equiv 0$). If
$H_*(x, f_N; \mathsf P) \ne \varnothing$ for all $x \ge 0$, then $C_*(f_N; \mathsf P) = \infty$.


Remark 2. We mention no ‘balance’ or other constraints on the investment portfolio
in the above definitions. One standard constraint of this kind is the condition of
‘self-financing’ (see (13) in the previous section). Of course, one must specify
the requirements imposed on ‘admissible’ strategies $\pi$ in one’s considerations of
particular cases.
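To make the definitions concrete, consider a one-step market ($N = 1$, $d = 1$) in which the stock price can take only two values at time 1. This two-state model and all numbers below are assumptions of the sketch, not part of the definitions. A buy-and-hold portfolio $(\beta, \gamma)$ is trivially self-financing, and a perfect hedge is found by solving two linear equations.

```python
from math import isclose


def perfect_hedge_one_period(s0, up, down, rate, payoff, b0=1.0):
    """Solve beta * B_1 + gamma * S_1 = payoff(S_1) in both states; return (x, beta, gamma)."""
    s_up, s_down = s0 * up, s0 * down
    b1 = b0 * (1.0 + rate)
    gamma = (payoff(s_up) - payoff(s_down)) / (s_up - s_down)
    beta = (payoff(s_up) - gamma * s_up) / b1
    return beta * b0 + gamma * s0, beta, gamma


# Hypothetical data: S_0 = 100, S_1 is 120 or 80, zero interest, pay-off max(S_1 - 100, 0).
x, beta, gamma = perfect_hedge_one_period(100.0, 1.2, 0.8, 0.0, lambda s: max(s - 100.0, 0.0))
print(x, beta, gamma)
```

A perfect hedge with initial capital $x$ belongs to both $H^*(x, f_N; \mathsf P)$ and $H_*(x, f_N; \mathsf P)$, so by Definition 2 it yields the bounds $C^* \le x \le C_*$ within the chosen class of strategies.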


Remark 3. It is possible that the classes $H^*(x, f_N; \mathsf P)$ and $H_*(x, f_N; \mathsf P)$ are empty
(for some $x$ at any rate). In this case, one reasonable method for comparing the
quality of different portfolios is to use the mean quadratic deviation

$$\mathsf E|X^\pi_N - f_N|^2$$

(see Chapter IV, § 1d).


3. We now discuss the concepts of ‘upper’ and ‘lower’ prices. 

If you are selling some contract (with pay-off function fy), then you surely 
wish to sell it at a high price. However, you must be aware that the buyer wants 
to buy a secure contract at a low price. In view of these opposite interests you 
must determine the smallest acceptable price, which, for one thing, would enable 
you to meet the terms of the contract (i.e., to pay the amount of fy at time N), 
but would give you no opportunity for arbitrage, no riskless gains (no free lunch), 
for the buyer has no reason to accept that. 

On the other hand, purchasing a contract you are certainly willing to buy it 
cheap, but you should expect no opportunities for arbitrage, no riskless revenues, 
for the seller has no reasons to agree either. 

We claim now that the ‘upper’ and ‘lower’ prices $C^* = C^*(f_N; \mathsf P)$ and
$C_* = C_*(f_N; \mathsf P)$ introduced above have the following property: the intervals $[0, C_*)$
and $(C^*, \infty)$ are the (maximal) sets of price values that give a buyer or a seller,
respectively, opportunities for arbitrage.

Assume that the price $x$ of a contract is larger than $C^*$ and that it is sold. Then
the seller can get a free lunch acting as follows.

He deducts from the total sum $x$ an amount $y$ such that $C^* < y < x$ and uses this
money to build a portfolio $\pi^*(y)$ such that $X_0^{\pi^*(y)} = y$ and $X_N^{\pi^*(y)} \ge f_N$ at time $N$.
The existence of such a portfolio follows by the definition of $C^*$ and the inequality
$y > C^*$. (We can describe the same action otherwise: the seller invests the amount $y$
in the $(B, S)$-market in accordance with the strategy $\pi^*(y) = (\beta^*_n(y), \gamma^*_n(y))_{0 \le n \le N}$.)

The value of this portfolio $\pi^*(y)$ at time $N$ is $X_N^{\pi^*(y)}$, and the total gains from
the two transactions (selling the contract and buying the portfolio $\pi^*(y)$) are

$$(x - f_N) + (X_N^{\pi^*(y)} - y) = (x - y) + (X_N^{\pi^*(y)} - f_N) \ge x - y > 0.$$

Here $x + X_N^{\pi^*(y)}$ are the returns from the transactions (at time $n = 0$ and $n = N$)
and $f_N + y$ is the amount payable at time $n = N$ and, at time $n = 0$, for the
purchase of the portfolio $\pi^*(y)$.

Thus, $x - y$ are the net riskless (i.e., arbitrage) gains of the seller.

We consider now the opportunities for arbitrage existing for the buyer. 
Assume that the buyer buys a contract stipulating the payment of the amount
$f_N$ at a price $x < C_*$. To get a free lunch the buyer can choose $y$ such that
$x < y < C_*$.

By the definition of $C_*$ there exists a portfolio $\pi_*(y)$ of initial (i.e., corresponding
to $n = 0$) value $y$ such that its value at the instant $n = N$ is $X_N^{\pi_*(y)} \le f_N$. Bearing
this in mind the buyer (who has already paid $x$ for getting the promised amount $f_N$
at time $N$) proceeds as follows.

At time $n = 0$ he invests the amount $(-y)$ in accordance with the strategy

$$\widehat{\pi}(-y) = -\pi_*(y), \quad \text{where } \pi_*(y) = (\beta_{*n}(y), \gamma_{*n}(y))_{0 \le n \le N}.$$

Thus,

$$\widehat{\pi}(-y) = (-\beta_{*n}(y), -\gamma_{*n}(y))_{0 \le n \le N},$$

so that $(-y)$ is distributed among bonds and stock in accordance with the formula

$$-y = -\beta_{*0}(y) B_0 - \gamma_{*0}(y) S_0.$$

The value of $\widehat{\pi}(-y)$ at time $N$ is

$$X_N^{\widehat{\pi}(-y)} = X_N^{-\pi_*(y)} = -X_N^{\pi_*(y)},$$

so that the total gains from the two transactions (buying the contract and investing
$(-y)$) are

$$(f_N - x) + \bigl(X_N^{\widehat{\pi}(-y)} - (-y)\bigr) = \bigl(f_N - X_N^{\pi_*(y)}\bigr) + (y - x) \ge y - x > 0,$$

which, as we see, are the net (arbitrage) gains of the buyer purchasing the contract
at the price $x < C_*$.
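
The two arbitrage arguments above are pure bookkeeping, and a small Python sketch can make them concrete. Everything specific in it is an assumption chosen for illustration and does not come from the text: a single period with $B_0 = B_1 = 1$, $S_0 = 100$, three possible values of $S_1$, a call-type pay-off, and two hand-picked portfolios, one dominating the pay-off and one dominated by it.

```python
def portfolio_values(beta, gamma, B0, B1, S0, S1):
    """Return (initial value, terminal value) of the portfolio (beta, gamma)."""
    return beta * B0 + gamma * S0, beta * B1 + gamma * S1


def seller_gain(x, beta, gamma, B0, B1, S0, S1, f):
    """Sell the contract for x, buy the portfolio for y, pay f(S1) at time N."""
    y, X_N = portfolio_values(beta, gamma, B0, B1, S0, S1)
    return (x - f(S1)) + (X_N - y)


def buyer_gain(x, beta, gamma, B0, B1, S0, S1, f):
    """Buy the contract for x and 'invest' -y in the negated portfolio."""
    y, X_N = portfolio_values(beta, gamma, B0, B1, S0, S1)
    return (f(S1) - x) + (-X_N - (-y))


# Hypothetical toy market (illustration only): r = 0, three outcomes for S1.
B0, B1, S0 = 1.0, 1.0, 100.0
outcomes = [80.0, 100.0, 120.0]


def pay_off(s):
    return max(s - 90.0, 0.0)


# Seller: price x = 17; portfolio (-59, 0.75) costs y = 16 and dominates f.
seller = [seller_gain(17.0, -59.0, 0.75, B0, B1, S0, s, pay_off) for s in outcomes]
# Buyer: price x = 8; portfolio (-91, 1) costs y = 9 and is dominated by f.
buyer = [buyer_gain(8.0, -91.0, 1.0, B0, B1, S0, s, pay_off) for s in outcomes]
print(seller, buyer)
```

In every scenario the seller gains at least $x - y = 1$ and the buyer at least $y - x = 1$, in agreement with the two displayed inequalities.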

In the above discussion we considered the investment of a negative (!) amount 
(-y). What is the actual meaning of this transaction? 

In fact, we already encountered such short selling, e.g., in the discussion of
options in Chapter I, § 1c.4. In the current case of the investment of $(-y)$ this
means merely that you find a speculator (cf. Chapter VII, § 1a) who agrees to pay
you the amount $y$ (at time $n = 0$) in return for your promise of the payment of
$X_N^{\pi_*(y)}$ at time $n = N$; the latter amount can turn out to be more or less
than $y$ due to the random nature of prices.


Remark. These arguments are (implicitly) based on the assumption that the mar- 
ket is sufficiently liquid: there exists a full spectrum of investors, traders, spec- 
ulators, ..., with different interests, views, and expectations of the behavior of 
prices. All this indicates, incidentally, that for a rigorous discussion one must pre- 
cisely specify the requirements and constraints on the admissible transactions. The 
classes of admissible strategies considered below (see, e.g., Chapter VI, § 1a) provide
examples of such constraints that, in particular, rule out arbitrage. 

Thus, we have two intervals, $[0, C_*)$ and $(C^*, \infty)$, of prices giving opportunities
for arbitrage. At the same time, if $x \in [C_*, C^*]$, then the buyer and the seller have
no such opportunities.

This justifies the name ‘interval of acceptable, mutually acceptable prices’ given
to $[C_*, C^*]$.

We emphasize again that a transaction at a mutually acceptable price $x \in [C_*, C^*]$
gives no riskless gains to either side. Both can gain or lose due to the random
behavior of prices. Hence, as already mentioned, a gain, and especially a big gain,
must be regarded as

‘a compensation for the risk’.


We summarize our discussion of prices acceptable to buyer and seller in the 
following figure: 


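[Figure: the price axis with the points $0$, $C_*$, $C^*$ marked. The interval $[0, C_*)$ is the
buyer’s preferable price domain, $(C^*, \infty)$ is the seller’s preferable price domain,
and $[C_*, C^*]$ is the domain of prices acceptable for both seller and buyer.]

FIGURE 51. Domains of prices preferable and acceptable for buyer and
seller ($C_* = C_*(f_N; P)$, $C^* = C^*(f_N; P)$)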


4. We consider now more thoroughly the case where, for fixed $x$ and a pay-off
$f_N$, there exists a perfect $(x, f_N)$-hedge $\pi$, i.e., a strategy such that $X_0^{\pi} = x$ and
$X_N^{\pi} = f_N$ (P-a.s.).

The equality $X_N^{\pi} = f_N$ means that the hedge $\pi$ replicates the contingent
claim $f_N$.

It is desirable for many reasons that each obligation be replicable (for some
value of $x$). If this is the case, then the classes $H^*(x, f_N; P)$ and $H_*(x, f_N; P)$
clearly coincide and the upper and lower prices are also the same:

$$C^*(f_N; P) = C_*(f_N; P).$$

In other words, the interval of acceptable prices reduces in this case to a single
price

$$C(f_N; P) \quad (= C^*(f_N; P) = C_*(f_N; P)),$$

the rational (fair) price of the contingent claim $f_N$ acceptable for both buyer and
seller. As explained above, a deviation from this price will inevitably give one of
them riskless gains!

This case, in view of its importance, deserves special terminology. 

DEFINITION 3. A (B, S)-market of securities is said to be N-perfect or perfect
with respect to time N, if each $\mathscr{F}_N$-measurable (finite) pay-off function $f_N$ can be
replicated, i.e., for some $x$ there exists a perfect $(x, f_N)$-hedge $\pi$, a portfolio such
that

$$X_N^{\pi} = f_N \quad \text{(P-a.s.)}.$$

Otherwise the market is said to be N-imperfect or imperfect with respect to N.


In general, the condition that a (B, S)-market be perfect is a fairly strong one
and it imposes rather serious constraints on the structure of the market. Incidentally,
it is not necessary in many cases that a perfect hedge exist for all $\mathscr{F}_N$-measurable
functions $f_N$; it suffices to consider, for instance, only bounded functions or
functions in a certain subclass described by some or other conditions of integrability
and measurability. (See, nevertheless, the theorem in § 4f.)


DEFINITION 4. We say that a (B, S)-market of securities is N-complete or complete
with respect to time N if each bounded $\mathscr{F}_N$-measurable pay-off is replicable.


The question of whether and when a market is perfect or complete is of major
interest for financial mathematics and engineering; to answer it means to show that
it is in principle possible (or impossible) to make up a portfolio $\pi$ of value $X_N^{\pi}$
replicating the pay-off $f_N$. (In the case where $f_N$ cannot be precisely replicated
one may, as already mentioned, pose the problem of finding a portfolio delivering
$\inf E\,|X_N^{\pi} - f_N|^2$, where the infimum is taken over all ‘admissible’ portfolios $\pi$; see
Chapter VI, § 1d in this connection.)

It seems difficult to answer this question in a satisfactory way in the general
case, making no additional assumptions about the structure of the market and the
probability space underlying this market. However, in the case of so-called arbitrage-free
markets (see the definitions in § 2a below) this problem of ‘completeness’ has an
exhaustive solution in terms of the uniqueness of the so-called martingale measures
(see Theorem A in § 2b and Theorem B in § 4a).

In the next section we consider a single-step model ($N = 1$) of a fairly simple
(B, S)-market, an example where one can clearly see how martingale (risk neutral)
measures enter in a natural way the calculations of the prices $C_*(f_N; P)$ and
$C^*(f_N; P)$, and one can get acquainted with the general pricing principles based
essentially on the use of these measures.


§1c. Upper and Lower Prices in a Single-Step Model 


1. We consider a simple ‘single-step’ model of a (B, S)-market formed by a bank
account $B = (B_n)$ and some stock of price $S = (S_n)$, where $n = 0, 1$. We assume
that the constants $B_0$ and $S_0$ are positive and (see § 1a)

$$B_1 = B_0(1 + r), \qquad S_1 = S_0(1 + \rho), \tag{1}$$

where the interest rate $r$ is a constant ($r > -1$) and the rate $\rho$ is a random variable
($\rho > -1$).

Since $\rho$ is the source of all ‘randomness’ in this model, we can content ourselves
with the consideration of a probability distribution $P = P(d\rho)$ on the set
$\{\rho: -1 < \rho < \infty\}$ with Borel subsets.


2. We shall assume now that the support of $P(d\rho)$ lies in the interval $[a, b]$, where
$-1 < a < b < \infty$. (If $P(d\rho)$ is concentrated at the two points $\{a\}$ and $\{b\}$, then (1)
is the single-step CRR-model introduced in Chapter II, § 1e.)

Let $f = f(S_1)$ be the pay-off function and, for the simplicity of notation, let
$C^*(P) = C^*(f; P)$ and $C_*(P) = C_*(f; P)$. Since $B_0 > 0$, we can assume without
loss of generality that $B_0 = 1$.

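In our single-step model the portfolio $\pi$ can be described by a pair of numbers
$\beta$ and $\gamma$ that must be specified at time $n = 0$.

In accordance with the definitions in the preceding section,

$$C^*(P) = \inf_{(\beta, \gamma) \in H^*(P)} (\beta + \gamma S_0) \tag{2}$$

and

$$C_*(P) = \sup_{(\beta, \gamma) \in H_*(P)} (\beta + \gamma S_0), \tag{3}$$

where

$$H^*(P) = \{(\beta, \gamma): \beta B_1 + \gamma S_1 \ge f(S_1) \ \text{P-a.s.}\} \tag{4}$$

and

$$H_*(P) = \{(\beta, \gamma): \beta B_1 + \gamma S_1 \le f(S_1) \ \text{P-a.s.}\}. \tag{5}$$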


We consider now the constraint

$$\beta(1 + r) + \gamma S_0(1 + \rho) \ge f(S_0(1 + \rho)) \quad \text{(P-a.s.)} \tag{6}$$

involved in the definition of $H^*(P)$ and introduce (however artificial it may seem
at first glance) the class $\mathscr{P}(P) = \{\widetilde{P} = \widetilde{P}(d\rho)\}$ of distributions on $[a, b]$ such that

$$\widetilde{P} \sim P$$

(i.e., the measures $\widetilde{P}$ and $P$ must be absolutely continuous with respect to each
other: $\widetilde{P} \ll P$ and $P \ll \widetilde{P}$) and

$$\int_a^b \rho\, \widetilde{P}(d\rho) = r. \tag{7}$$

We assume that $\mathscr{P}(P) \ne \emptyset$. (This holds surely in the CRR-model considered below,
in § 1d and in detail in § 3f.)

We point out the following important property underlying many calculations in
financial mathematics:

if a measure $\widetilde{P}$ is equivalent to $P$, then ‘P-a.s.’ in inequality (6) can be
replaced by ‘$\widetilde{P}$-a.s.’.

Consequently, if $(\beta, \gamma) \in H^*(P)$, then

$$\beta(1 + r) + \gamma S_0(1 + \rho) \ge f(S_0(1 + \rho)) \quad (\widetilde{P}\text{-a.s.}), \tag{8}$$

therefore, integrating both parts of this inequality with respect to $\widetilde{P} \in \mathscr{P}(P)$ and
taking (7) into account we see that

$$\beta + \gamma S_0 \ge E_{\widetilde{P}} \frac{f(S_0(1 + \rho))}{1 + r}. \tag{9}$$

From this inequality we can obviously derive the following lower bound for $C^*(P)$:

$$C^*(P) = \inf_{(\beta, \gamma) \in H^*(P)} (\beta + \gamma S_0)
\ge \sup_{\widetilde{P} \in \mathscr{P}(P)} E_{\widetilde{P}} \frac{f(S_0(1 + \rho))}{1 + r} \quad (= x^*). \tag{10}$$

In a similar way we obtain an upper bound for $C_*(P)$:

$$C_*(P) \le \inf_{\widetilde{P} \in \mathscr{P}(P)} E_{\widetilde{P}} \frac{f(S_0(1 + \rho))}{1 + r} \quad (= x_*). \tag{11}$$

Thus,

$$C_*(P) \le x_* \le x^* \le C^*(P), \tag{12}$$

which (provided that $\mathscr{P}(P) \ne \emptyset$) proves the inequality $C_*(P) \le C^*(P)$, not all that
obvious from the definitions of $C_*(P)$ and $C^*(P)$.
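We proceed now to property (7), which can be written as

$$E_{\widetilde{P}} \frac{1 + \rho}{1 + r} = 1. \tag{13}$$

By (1), this is equivalent to

$$E_{\widetilde{P}} \frac{S_1}{B_1} = \frac{S_0}{B_0}. \tag{14}$$

We set $\mathscr{F}_0 = \{\emptyset, \Omega\}$ and $\mathscr{F}_1 = \sigma(\rho)$; then we see that

$$E_{\widetilde{P}}\Bigl(\frac{S_1}{B_1} \Bigm| \mathscr{F}_0\Bigr) = \frac{S_0}{B_0},$$

i.e., the sequence $\bigl(S_n / B_n\bigr)_{n = 0, 1}$ is a martingale with respect to the measure $\widetilde{P}$.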

It is because of just this property (which holds also in a more general case) that
measures $\widetilde{P} \in \mathscr{P}(P)$ are usually called martingale measures in financial mathematics.

We note that their appearance in the calculation of, say, upper and lower prices,
can at first seem forced, because there is already a probability measure on the
original basis and, presumably, all the pricing should be based on this measure
alone. In effect, this is the case because

$$E_{\widetilde{P}} \frac{f(S_0(1 + \rho))}{1 + r} = E_P\, z(\rho)\, \frac{f(S_0(1 + \rho))}{1 + r}, \tag{15}$$

where $z(\rho) = \dfrac{d\widetilde{P}}{dP}$ is the Radon-Nikodym derivative of $\widetilde{P}$ with respect to $P$.
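
Identity (15) is easy to check by hand when $P$ charges finitely many points, since the Radon-Nikodym derivative is then just the ratio of the point masses. The following Python sketch does this with exact rational arithmetic. The three-point support, the weights of $P$ and of the martingale measure, the rate $r$, and the call pay-off are all hypothetical values chosen for illustration.

```python
from fractions import Fraction as F

# Hypothetical data (illustration only).
S0, r = F(100), F(1, 20)
rho = [F(-1, 10), F(0), F(1, 5)]      # support of P
P = [F(1, 2), F(1, 4), F(1, 4)]       # original measure
Q = [F(1, 4), F(3, 8), F(3, 8)]       # an equivalent measure with mean r


def f(s):
    """Call-type pay-off with strike 100 (an assumption of this sketch)."""
    return max(s - 100, 0)


# Q is a martingale measure in the sense of (7): total mass 1 and mean r.
total_mass = sum(Q)
mean_rho = sum(q * x for q, x in zip(Q, rho))

# Radon-Nikodym derivative z = dQ/dP on the three atoms.
z = [q / p for q, p in zip(Q, P)]

lhs = sum(q * f(S0 * (1 + x)) for q, x in zip(Q, rho)) / (1 + r)
rhs = sum(p * zi * f(S0 * (1 + x)) for p, zi, x in zip(P, z, rho)) / (1 + r)
print(lhs, rhs)
```

Both sides equal $50/7$ here, so pricing under the martingale measure is indeed a reweighted expectation under the original measure.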


However, the introduction of martingale measures has a deeper meaning because
their existence is in direct connection with the absence of arbitrage on our
(B, S)-market.

We shall discuss this issue in § 2 “Arbitrage-free market”, while here we consider
the question of equality signs in (10) and (11).


3. Assume that the function $f_x = f(S_0(1 + x))$ is (downwards) convex and continuous
on $[a, b]$. (We recall that each convex function on $[a, b]$ is continuous on the
open interval $(a, b)$ and can make jumps only at the end-points of this interval.)

FIGURE 52. Pay-off function $f_x = f(S_0(1 + x))$

We plot a straight line $y = y(x)$ through the points $(a, f_a)$ and $(b, f_b)$. If it has
an equation

$$y(x) = \mu S_0(1 + x) + \nu, \tag{16}$$

then, clearly,

$$\mu = \frac{f_b - f_a}{S_0(b - a)} \quad \text{and} \quad \nu = \frac{(1 + b) f_a - (1 + a) f_b}{b - a}. \tag{17}$$

Let us introduce the strategy $\pi^* = (\beta^*, \gamma^*)$ with

$$\beta^* = \frac{\nu}{1 + r} \quad \text{and} \quad \gamma^* = \mu. \tag{18}$$

Since $(1 + r)\beta^* + S_0(1 + \rho)\gamma^* = \nu + \mu S_0(1 + \rho) \ge f_\rho$ for all $\rho \in [a, b]$, it follows
that $\pi^* \in H^*(P)$, and therefore

$$C^*(P) = \inf_{(\beta, \gamma) \in H^*(P)} (\beta + \gamma S_0) \le \beta^* + \gamma^* S_0 = \frac{\nu}{1 + r} + \mu S_0. \tag{19}$$


We now make the following assumption about the set $\mathscr{P}(P)$ of martingale measures:
(A*): there exists a sequence $(\widetilde{P}_n)_{n \ge 1}$ of measures in $\mathscr{P}(P)$ that converges weakly
to a measure $P^*$ concentrated at the two points $a$ and $b$.

If (A*) holds, then we see from the equality

$$E_{\widetilde{P}_n} \frac{1 + \rho}{1 + r} = 1$$

that

$$E_{P^*} \frac{1 + \rho}{1 + r} = 1.$$

Hence the probabilities $p^* = P^*\{b\}$ and $q^* = P^*\{a\}$ satisfy the conditions

$$p^* + q^* = 1$$

and

$$b p^* + a q^* = r.$$

Consequently,

$$p^* = \frac{r - a}{b - a}, \qquad q^* = \frac{b - r}{b - a}. \tag{20}$$
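
As a quick check of (20), the following Python sketch solves the two linear conditions for $p^*$ and $q^*$ and verifies the martingale property (14) for the resulting two-point measure. The numerical values of $a$, $b$, $r$, $S_0$ are hypothetical, and the requirement $a < r < b$ enforced in the function is an assumption of the sketch that guarantees positive weights.

```python
from fractions import Fraction as F


def two_point_martingale_measure(a, b, r):
    """Solve p + q = 1, b*p + a*q = r for the weights at b and a, as in (20)."""
    if not (a < r < b):
        raise ValueError("positive weights require a < r < b")
    p_star = (r - a) / (b - a)
    q_star = (b - r) / (b - a)
    return p_star, q_star


# Hypothetical parameters (illustration only).
a, b, r = F(-1, 10), F(1, 5), F(1, 20)
S0, B0 = F(100), F(1)

p_star, q_star = two_point_martingale_measure(a, b, r)
B1 = B0 * (1 + r)
# Expectation of S1 / B1 under the two-point measure P*.
expected_discounted_S1 = (p_star * S0 * (1 + b) + q_star * S0 * (1 + a)) / B1
print(p_star, q_star, expected_discounted_S1)
```

With these numbers $p^* = q^* = 1/2$ and the expected discounted price equals $S_0 / B_0 = 100$.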


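Further, using our assumption of the weak convergence of $\widetilde{P}_n$ to $P^*$ again and
in view of (19), we obtain

$$
\begin{aligned}
\sup_{\widetilde{P} \in \mathscr{P}(P)} E_{\widetilde{P}} \frac{f_\rho}{1 + r}
&\ge \lim_n E_{\widetilde{P}_n} \frac{f_\rho}{1 + r} = E_{P^*} \frac{f_\rho}{1 + r} \\
&= p^* \frac{f_b}{1 + r} + q^* \frac{f_a}{1 + r}
 = \frac{r - a}{b - a}\,\frac{f_b}{1 + r} + \frac{b - r}{b - a}\,\frac{f_a}{1 + r} \\
&= \mu S_0 + \frac{\nu}{1 + r} = \gamma^* S_0 + \beta^* \\
&\ge \inf_{(\beta, \gamma) \in H^*(P)} (\beta + \gamma S_0) = C^*(P).
\end{aligned} \tag{21}
$$

Combined with the reverse inequality (10) this means that

$$C^*(P) = \sup_{\widetilde{P} \in \mathscr{P}(P)} E_{\widetilde{P}} \frac{f_\rho}{1 + r}. \tag{22}$$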

Concerning the above discussion it should be noted that our assumption that
$f = f(S_0(1 + \rho))$ is a continuous convex function of $\rho \in [a, b]$ is fairly standard and
should encounter no serious objections.

Assumption (A*) about the existence of martingale measures weakly convergent
to a measure concentrated at $a$ and $b$ is more controversial. However, it is not that
‘dreadful’ provided that there exist non-vanishing P-masses close to $a$ and $b$ (i.e.,
for each $\varepsilon > 0$ we have $P[a, a + \varepsilon] > 0$ and $P[b - \varepsilon, b] > 0$). If there exists at least one
martingale measure $\widetilde{P} \sim P$ with the same properties (i.e., such that $\widetilde{P}[a, a + \varepsilon] > 0$,
$\widetilde{P}[b - \varepsilon, b] > 0$ for each $\varepsilon > 0$, and $\int_a^b \rho\, \widetilde{P}(d\rho) = r$), then we can construct the
required sequence $\{\widetilde{P}_n\}$ by ‘pumping’ the measure $\widetilde{P}$ into ever smaller ($\varepsilon \downarrow 0$)
neighborhoods $[a, a + \varepsilon]$ and $[b - \varepsilon, b]$ of the points $a$ and $b$ preserving at the same
time the equivalence $\widetilde{P}_n \sim P$.

In the next section we consider the CRR (Cox-Ross-Rubinstein) model in which
the original measure $P$ is concentrated at $a$ and $b$ and the construction of $P^*$ proceeds
without complications. (Actually, we already constructed this measure in (20).)

We now formulate the result on C* obtained in this way. 


THEOREM 1. Assume that the pay-off function $f(S_0(1 + \rho))$ is convex and continuous
in $\rho$ on $[a, b]$ and assume that the weak compactness condition (A*) holds.
Then the upper price can be expressed by the formula

$$C^*(P) = \sup_{\widetilde{P} \in \mathscr{P}(P)} E_{\widetilde{P}} \frac{f(S_0(1 + \rho))}{1 + r}. \tag{23}$$

Moreover, the supremum is attained at the measure $P^*$ and

$$C^*(P) = \frac{r - a}{b - a}\,\frac{f_b}{1 + r} + \frac{b - r}{b - a}\,\frac{f_a}{1 + r}, \tag{24}$$

where $f_\rho = f(S_0(1 + \rho))$.
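
Formula (24) and the chord strategy (17), (18) can be compared numerically. The Python sketch below computes both for a call-type pay-off and checks on a grid that the chord portfolio dominates the pay-off. The parameters and the pay-off are hypothetical, and the grid check is only a finite sample of the interval $[a, b]$, not a proof of the inequality.

```python
from fractions import Fraction as F


def upper_price(f, S0, a, b, r):
    """Right-hand side of (24)."""
    p_star = (r - a) / (b - a)
    q_star = (b - r) / (b - a)
    return (p_star * f(S0 * (1 + b)) + q_star * f(S0 * (1 + a))) / (1 + r)


def chord_strategy(f, S0, a, b, r):
    """(beta*, gamma*) from (17) and (18)."""
    fa, fb = f(S0 * (1 + a)), f(S0 * (1 + b))
    mu = (fb - fa) / (S0 * (b - a))
    nu = ((1 + b) * fa - (1 + a) * fb) / (b - a)
    return nu / (1 + r), mu


def call(s):
    """Convex pay-off with strike 100 (an assumption of this sketch)."""
    return max(s - 100, 0)


# Hypothetical parameters (illustration only).
S0, a, b, r = F(100), F(-1, 10), F(1, 5), F(1, 20)

beta, gamma = chord_strategy(call, S0, a, b, r)
cost = beta + gamma * S0
grid = [a + (b - a) * F(k, 30) for k in range(31)]
dominates = all(
    beta * (1 + r) + gamma * S0 * (1 + x) >= call(S0 * (1 + x)) for x in grid
)
print(cost, upper_price(call, S0, a, b, r), dominates)
```

The initial cost of the chord portfolio coincides with the value given by (24), which is the content of the chain of equalities in (21).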
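
4. We consider now the lower price $C_*$.
By (3) and (11),

$$C_*(P) = \sup_{(\beta, \gamma) \in H_*(P)} (\beta + \gamma S_0) \le \inf_{\widetilde{P} \in \mathscr{P}(P)} E_{\widetilde{P}} \frac{f_\rho}{1 + r}, \tag{25}$$

where $f_\rho = f(S_0(1 + \rho))$, $\rho \in [a, b]$.
If $f_\rho$ is (downwards) convex on $[a, b]$, then for each $r \in (a, b)$ there exists $\lambda = \lambda(r)$
such that

$$f(S_0(1 + \rho)) \ge f(S_0(1 + r)) + (\rho - r)\lambda(r) \tag{26}$$

for all $\rho \in [a, b]$, where the function
$\rho \mapsto f(S_0(1 + r)) + (\rho - r)\lambda(r)$
is the ‘support line’ (a line through the point of the graph with abscissa $r$ such that
the graph of $f_\rho$ lies above it).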
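
Let $\widetilde{P} \in \mathscr{P}(P)$. Then by (26) we obtain

$$\inf_{\widetilde{P} \in \mathscr{P}(P)} E_{\widetilde{P}} \frac{f(S_0(1 + \rho))}{1 + r} \ge \frac{f(S_0(1 + r))}{1 + r}. \tag{27}$$

We set

$$\beta_* = \frac{f(S_0(1 + r))}{1 + r} - \lambda(r)$$

and

$$\gamma_* = \frac{\lambda(r)}{S_0}.$$

Then (26) takes the following form: for each $\rho \in [a, b]$ we have

$$f(S_0(1 + \rho)) \ge \beta_*(1 + r) + \gamma_* S_0(1 + \rho),$$

which means that $\pi_* = (\beta_*, \gamma_*) \in H_*(P)$.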
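Following the pattern of the proof of Theorem 1 we assume now that
(A_*): there exists a sequence $\{\widetilde{P}_n\}_{n \ge 1}$ of measures in $\mathscr{P}(P)$ converging weakly to
a measure $P_*$ concentrated at a single point $r$.
Then, assuming that $f_\rho$ is a continuous function, we obtain

$$
\begin{aligned}
\inf_{\widetilde{P} \in \mathscr{P}(P)} E_{\widetilde{P}} \frac{f_\rho}{1 + r}
&\le \lim_n E_{\widetilde{P}_n} \frac{f_\rho}{1 + r} = E_{P_*} \frac{f_\rho}{1 + r} = \frac{f(S_0(1 + r))}{1 + r} \\
&= \beta_* + S_0 \gamma_* \le \sup_{(\beta, \gamma) \in H_*(P)} (\beta + S_0 \gamma) = C_*(P),
\end{aligned}
$$

which, combined with (25), proves the following result.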


THEOREM 2. Let $f_\rho$ be a continuous, downwards convex function on $[a, b]$ and
assume that the weak compactness condition (A_*) holds. Then the lower price can be
expressed by the formula

$$C_*(P) = \inf_{\widetilde{P} \in \mathscr{P}(P)} E_{\widetilde{P}} \frac{f(S_0(1 + \rho))}{1 + r}. \tag{28}$$

Moreover, the infimum is attained at the measure $P_*$ and

$$C_*(P) = \frac{f_r}{1 + r}. \tag{29}$$

Remark. Let, for instance, $P(d\rho) = \dfrac{d\rho}{b - a}$ be the uniform measure on $[a, b]$. Then
conditions (A*) and (A_*) are satisfied, so that the upper and the lower prices can
be defined by (24) and (29).
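
The lower price (29) and the support-line strategy $\pi_* = (\beta_*, \gamma_*)$ admit the same kind of numerical check. In the Python sketch below the pay-off is again a call with strike 100 and the parameters are hypothetical. The slope $\lambda(r)$ is supplied by hand: for this pay-off and these numbers the point $S_0(1 + r) = 105$ lies above the strike, so the slope of $\rho \mapsto f(S_0(1 + \rho))$ at $\rho = r$ is $S_0$.

```python
from fractions import Fraction as F


def lower_price(f, S0, r):
    """Right-hand side of (29): the pay-off at rho = r, discounted."""
    return f(S0 * (1 + r)) / (1 + r)


def support_strategy(f, S0, r, lam):
    """(beta_*, gamma_*) built from a support-line slope lam = lambda(r)."""
    beta = f(S0 * (1 + r)) / (1 + r) - lam
    gamma = lam / S0
    return beta, gamma


def call(s):
    """Convex pay-off with strike 100 (an assumption of this sketch)."""
    return max(s - 100, 0)


# Hypothetical parameters (illustration only).
S0, a, b, r = F(100), F(-1, 10), F(1, 5), F(1, 20)
lam = S0  # slope of the pay-off in rho at rho = r, valid since 105 > 100

beta, gamma = support_strategy(call, S0, r, lam)
cost = beta + gamma * S0
grid = [a + (b - a) * F(k, 30) for k in range(31)]
dominated = all(
    beta * (1 + r) + gamma * S0 * (1 + x) <= call(S0 * (1 + x)) for x in grid
)
print(cost, lower_price(call, S0, r), dominated)
```

Here the cost of $\pi_*$ equals $100/21$, which is the value (29), and it is smaller than the upper price $200/21$ that formula (24) gives for the same data.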

5. In the above ‘probabilistic’ analysis of upper and lower prices we took it for
granted that all the uncertainty in stock prices can be described in probabilistic
terms. This was reflected in the assumption that $\rho$ is a random variable with
probability distribution $P(d\rho)$.

However, we could treat $\rho$ merely as a chaotic variable ranging in the interval
$[a, b]$. In that case, in place of $H^*(P)$ and $H_*(P)$ it is reasonable to introduce
the classes

$$\widehat{H}^* = \{(\beta, \gamma): \beta(1 + r) + \gamma S_0(1 + \rho) \ge f(S_0(1 + \rho)) \text{ for each } \rho \in [a, b]\}$$

and

$$\widehat{H}_* = \{(\beta, \gamma): \beta(1 + r) + \gamma S_0(1 + \rho) \le f(S_0(1 + \rho)) \text{ for each } \rho \in [a, b]\}$$

(i.e., to replace the condition ‘P-a.s.’ by the requirement ‘for each $\rho \in [a, b]$’).
Clearly,

$$\widehat{H}^* \subseteq H^*(P) \quad \text{and} \quad \widehat{H}_* \subseteq H_*(P)$$

for each probability measure $P$.

We note that even if no probability measure has been fixed a priori, there are no
indications against the (maybe, forced) introduction of a distribution $\widehat{P} = \widehat{P}(d\rho)$
on $[a, b]$ such that (cf. (7))

$$\int_a^b \rho\, \widehat{P}(d\rho) = r.$$

The class $\widehat{\mathscr{P}}$ of such measures is surely non-empty; it contains the above-discussed
measures $P^*$, concentrated at $a$ and $b$ (see (20)), and $P_*$, concentrated at a single
point $r$.

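By analogy with (2) and (3) we now set

$$\widehat{C}^* = \inf_{(\beta, \gamma) \in \widehat{H}^*} (\beta + \gamma S_0) \quad (\ge C^*(P))$$

and

$$\widehat{C}_* = \sup_{(\beta, \gamma) \in \widehat{H}_*} (\beta + \gamma S_0) \quad (\le C_*(P)).$$

The above arguments (see (8), (9), and (10)) show that for each probability
measure $P$ we have

$$\widehat{C}^* = \inf_{(\beta, \gamma) \in \widehat{H}^*} (\beta + \gamma S_0)
\ge \sup_{\widehat{P} \in \widehat{\mathscr{P}}} E_{\widehat{P}} \frac{f(S_0(1 + \rho))}{1 + r}
\ge \sup_{\widetilde{P} \in \mathscr{P}(P)} E_{\widetilde{P}} \frac{f(S_0(1 + \rho))}{1 + r} \tag{30}$$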
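and

$$\widehat{C}_* = \sup_{(\beta, \gamma) \in \widehat{H}_*} (\beta + \gamma S_0)
\le \inf_{\widehat{P} \in \widehat{\mathscr{P}}} E_{\widehat{P}} \frac{f(S_0(1 + \rho))}{1 + r}
\le \inf_{\widetilde{P} \in \mathscr{P}(P)} E_{\widetilde{P}} \frac{f(S_0(1 + \rho))}{1 + r}. \tag{31}$$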


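We note next that the class $\widehat{\mathscr{P}}$ contains also both the ‘two-point’ measure $P^*$ and
the ‘single-point’ measure $P_*$ considered above, therefore (cf. (21))

$$\sup_{\widehat{P} \in \widehat{\mathscr{P}}} E_{\widehat{P}} \frac{f(S_0(1 + \rho))}{1 + r}
\ge E_{P^*} \frac{f(S_0(1 + \rho))}{1 + r} = p^* \frac{f_b}{1 + r} + q^* \frac{f_a}{1 + r} \tag{32}$$

and

$$\inf_{\widehat{P} \in \widehat{\mathscr{P}}} E_{\widehat{P}} \frac{f(S_0(1 + \rho))}{1 + r}
\le E_{P_*} \frac{f(S_0(1 + \rho))}{1 + r} = \frac{f_r}{1 + r}. \tag{33}$$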
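The strategy $\pi^*$ discussed in the proof of Theorem 1 belongs clearly to the
class $\widehat{H}^*$, and (provided that $f_\rho$ is a (downwards) convex function on $[a, b]$)

$$\widehat{C}^* = \inf_{(\beta, \gamma) \in \widehat{H}^*} (\beta + \gamma S_0) \le \beta^* + \gamma^* S_0 = p^* \frac{f_b}{1 + r} + q^* \frac{f_a}{1 + r},$$

which together with (30) and (32) shows (cf. (23)) that

$$\widehat{C}^* = \sup_{\widehat{P} \in \widehat{\mathscr{P}}} E_{\widehat{P}} \frac{f(S_0(1 + \rho))}{1 + r}; \tag{34}$$

moreover (cf. (24)),

$$\widehat{C}^* = p^* \frac{f_b}{1 + r} + q^* \frac{f_a}{1 + r}, \tag{35}$$

where $f_\rho = f(S_0(1 + \rho))$.
In a similar way (cf. (28) and (29)),

$$\widehat{C}_* = \inf_{\widehat{P} \in \widehat{\mathscr{P}}} E_{\widehat{P}} \frac{f(S_0(1 + \rho))}{1 + r}; \tag{36}$$

moreover,

$$\widehat{C}_* = \frac{f_r}{1 + r}. \tag{37}$$

Thus, we have established the following result.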

Thus, we have established the following result. 

THEOREM 3. If the variable $\rho$ involved in (1) and ranging in $[a, b]$ is ‘chaotic’, then
for each downwards convex function $f = f(S_0(1 + \rho))$ the upper and lower prices
$\widehat{C}^*$ and $\widehat{C}_*$ can be defined by formulas (34), (35) and (36), (37), respectively.

Remark. Comparing Theorems 1, 2, and 3 we see that if the original probability
measure $P$ is sufficiently ‘fuzzy’ so that it has masses close to the points $a$, $r$, and $b$,
then the class $\mathscr{P}(P)$ of martingale measures is as rich as $\widehat{\mathscr{P}}$ and we have equality
signs in the inequalities $\widehat{C}_* \le C_*(P)$ and $C^*(P) \le \widehat{C}^*$.
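
To see how masses near $a$, $r$, and $b$ fill the whole interval between the lower and the upper price, one can look at a one-parameter family of measures with mean $r$ supported on the three points $a$, $r$, $b$. The Python sketch below is such an illustration. The family, indexed by the mass $t$ placed at $r$, is a construction made up for this sketch, and the parameters and the call pay-off are hypothetical.

```python
from fractions import Fraction as F


def price_under(t, f, S0, a, b, r):
    """Discounted expectation of f under the measure with mass t at rho = r
    and mass (1 - t) split between b and a with the weights p*, q* of (20)."""
    p_star = (r - a) / (b - a)
    q_star = (b - r) / (b - a)
    fa, fr, fb = (f(S0 * (1 + x)) for x in (a, r, b))
    return (t * fr + (1 - t) * (p_star * fb + q_star * fa)) / (1 + r)


def call(s):
    """Convex pay-off with strike 100 (an assumption of this sketch)."""
    return max(s - 100, 0)


# Hypothetical parameters (illustration only).
S0, a, b, r = F(100), F(-1, 10), F(1, 5), F(1, 20)

prices = [price_under(F(k, 4), call, S0, a, b, r) for k in range(5)]
print(prices)
```

Each measure in the family has mean $r$, since $t r + (1 - t)(b p^* + a q^*) = r$. The value at $t = 0$ is the upper price (35), the value at $t = 1$ is the lower price (37), and the intermediate values decrease from one to the other as $t$ grows.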


§1d. CRR Model: an Example of a Complete Market 
1. We consider again a ‘single-step’ model of a (B, S)-market, in which we assume
that

$$B_1 = B_0(1 + r), \qquad S_1 = S_0(1 + \rho), \tag{1}$$

where $\rho$ is a random variable taking just two values, $a$ and $b$, such that

$$-1 < a < r < b. \tag{2}$$

This simple (B, S)-market is called the single-step CRR-model (after J. C. Cox,
S. A. Ross, and M. Rubinstein who considered it in [82]).
We assume that the initial distribution $P$ of the random variable $\rho$ is as follows:

$$p = P\{b\} > 0, \qquad q = P\{a\} > 0.$$

Then the unique (martingale) measure $P^*$ equivalent to $P$ and satisfying property (7)
in the preceding section can be defined as follows:

$$P^*\{b\} = p^*, \qquad P^*\{a\} = q^*,$$

where (see (20) in § 1c)

$$p^* = \frac{r - a}{b - a}, \qquad q^* = \frac{b - r}{b - a}, \tag{3}$$

and we have (see (11) and (12) in § 1c)

$$C_*(P) \le E_{P^*} \frac{f_\rho}{1 + r} \le C^*(P). \tag{4}$$

In fact, the prices $C_*(P)$ and $C^*(P)$ are the same in our case for each pay-off
function $f = f(S_0(1 + \rho))$, so that their common value is given by the formula

$$C(P) = E_{P^*} \frac{f_\rho}{1 + r} = p^* \frac{f_b}{1 + r} + q^* \frac{f_a}{1 + r}. \tag{5}$$

Actually, all we need for the proof of the equality $C_*(P) = C^*(P)$ is already
contained in the discussion in § 1c.
For let us consider the above-introduced strategy $\pi^* = (\beta^*, \gamma^*)$ with parameters

$$\beta^* = \frac{\nu}{1 + r}, \qquad \gamma^* = \mu,$$

where $\nu$ and $\mu$ are as in formula (17) in § 1c.
For both $\rho = a$ and $\rho = b$ we have

$$\beta^*(1 + r) + \gamma^* S_0(1 + \rho) = f(S_0(1 + \rho)),$$

therefore the pay-off function here can be replicated, so that $\pi^* \in H^*(P) \cap H_*(P)$.
Hence

$$C_*(P) = \sup_{(\beta, \gamma) \in H_*(P)} (\beta + \gamma S_0) \ge \beta^* + \gamma^* S_0
\ge \inf_{(\beta, \gamma) \in H^*(P)} (\beta + \gamma S_0) = C^*(P).$$

Combined with (4), this proves the required equality of $C_*(P)$ and $C^*(P)$ and
formula (5) for their common value $C(P)$.
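
Since no convexity is needed in the two-point case, replication can be illustrated with an arbitrary pay-off. The Python sketch below solves the two replication equations directly and compares the cost of the portfolio with formula (5). The digital-type pay-off and the parameters are hypothetical choices made for this illustration.

```python
from fractions import Fraction as F


def replicate(f, S0, a, b, r):
    """Solve beta*(1+r) + gamma*S0*(1+rho) = f(S0*(1+rho)) for rho = a, b."""
    fa, fb = f(S0 * (1 + a)), f(S0 * (1 + b))
    gamma = (fb - fa) / (S0 * (b - a))
    beta = (fa - gamma * S0 * (1 + a)) / (1 + r)
    return beta, gamma


def crr_price(f, S0, a, b, r):
    """Right-hand side of (5)."""
    p_star = (r - a) / (b - a)
    q_star = (b - r) / (b - a)
    return (p_star * f(S0 * (1 + b)) + q_star * f(S0 * (1 + a))) / (1 + r)


def digital(s):
    """Non-convex pay-off: 1 if the price ends below 100 (an assumption)."""
    return F(1) if s < 100 else F(0)


# Hypothetical parameters (illustration only).
S0, a, b, r = F(100), F(-1, 10), F(1, 5), F(1, 20)

beta, gamma = replicate(digital, S0, a, b, r)
cost = beta + gamma * S0
terminal = [beta * (1 + r) + gamma * S0 * (1 + x) for x in (a, b)]
print(cost, crr_price(digital, S0, a, b, r), terminal)
```

The terminal values match the pay-off in both states and the initial cost equals the expectation in (5), here $10/21$.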
2. We point out again the following important points revealed by our exposition
here and in the previous section. They are valid also in cases when the martingale
methods are used in a more general context of financial calculations relating to a
fixed pay-off function:
(I) if the class of martingale measures is non-empty:
$$\mathscr{P}(P) \ne \emptyset, \tag{6}$$
then
$$C_*(P) \le C^*(P)$$
(by formula (12), § 1c);
(II) if
$$H^*(P) \cap H_*(P) \ne \emptyset, \tag{7}$$
i.e., the pay-off function is replicable, then
$$C_*(P) \ge C^*(P);$$
(III) if both (6) and (7) hold, then the upper and the lower prices $C_*(P)$ and
$C^*(P)$ are the same.
We show in the next section that the non-emptiness of the class of martingale
measures $\mathscr{P}(P)$ is directly related to the absence of arbitrage.
On the other hand, the non-emptiness of $H^*(P) \cap H_*(P)$ (for an arbitrary pay-off
function $f$) is related to the uniqueness of the martingale measure, i.e., to the
question whether the set $\mathscr{P}(P)$ reduces to a single (martingale) measure $\widetilde{P}$ equivalent
to $P$ ($\widetilde{P} \sim P$).


2. Arbitrage-Free Market 


§ 2a. ‘Arbitrage’ and ‘Absence of Arbitrage’ 


1. In a few words, the ‘absence of arbitrage’ on a market means that this market
is ‘fair’, ‘rational’, and one can make no ‘riskless’ profits there. (Cf. the concept
of ‘efficient market’ in Chapter I, § 2a, which is also based on a certain notion of
what a ‘rationally organized market’ must be and where one simply postulates that
prices on such a market must have the martingale property.)

For formal definitions we shall assume (as in § 1a) that we have a filtered
probability space

$$(\Omega, \mathscr{F}, (\mathscr{F}_n)_{n \ge 0}, P)$$

and a (B, S)-market on this space formed by $d + 1$ assets:
a bank account $B = (B_n)_{n \ge 0}$
with $\mathscr{F}_{n-1}$-measurable $B_n$, $B_n > 0$, and
a $d$-dimensional risk asset $S = (S^1, \dots, S^d)$,

where $S^i = (S^i_n)_{n \ge 0}$, and the $S^i_n$ are positive and $\mathscr{F}_n$-measurable.
Let $X^{\pi} = (X_n^{\pi})_{n \ge 0}$ be the value

$$X_n^{\pi} = \beta_n B_n + \gamma_n S_n \quad \Bigl(= \beta_n B_n + \sum_{i=1}^{d} \gamma_n^i S_n^i\Bigr)$$

of the strategy $\pi = (\beta, \gamma)$ with predictable $\beta = (\beta_n)_{n \ge 0}$ and $\gamma = (\gamma^1, \dots, \gamma^d)$,
$\gamma^i = (\gamma_n^i)_{n \ge 0}$.




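If $\pi$ is a self-financing strategy ($\pi \in SF$), then (see (12) in § 1a)

$$X_n^{\pi} = X_0^{\pi} + \sum_{k=1}^{n} (\beta_k \Delta B_k + \gamma_k \Delta S_k), \quad n \ge 1, \tag{1}$$

and the discounted value of the portfolio $\widetilde{X}^{\pi} = \bigl(X_n^{\pi} / B_n\bigr)_{n \ge 0}$ satisfies the relations

$$\Delta\Bigl(\frac{X_n^{\pi}}{B_n}\Bigr) = \gamma_n \Delta\Bigl(\frac{S_n}{B_n}\Bigr), \tag{2}$$

which are of key importance for all the analysis that follows.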


2. We fix some $N \ge 1$; we are interested in the value $X_N^{\pi}$ of one or another strategy
$\pi \in SF$ at this ‘terminal’ instant.

DEFINITION 1. We say that a self-financing strategy $\pi$ brings about an opportunity
for arbitrage (at time $N$) if, for starting capital

$$X_0^{\pi} = 0, \tag{3}$$

we have

$$X_N^{\pi} \ge 0 \quad \text{(P-a.s.)} \tag{4}$$

and $X_N^{\pi} > 0$ with positive P-probability, i.e.,

$$P(X_N^{\pi} > 0) > 0 \tag{5}$$

or, equivalently,

$$E X_N^{\pi} > 0. \tag{6}$$

Let $SF_{\mathrm{arb}}$ be the class of self-financing strategies with opportunities for arbitrage.
If $\pi \in SF_{\mathrm{arb}}$ and $X_0^{\pi} = 0$, then

$$P(X_N^{\pi} \ge 0) = 1 \implies P(X_N^{\pi} > 0) > 0.$$

DEFINITION 2. We say that there exist no opportunities for arbitrage on a (B, S)-market
or that the market is arbitrage-free if $SF_{\mathrm{arb}} = \emptyset$. In other words, if the
starting capital $X_0^{\pi}$ of a strategy $\pi$ is zero, then

$$P(X_N^{\pi} \ge 0) = 1 \implies P(X_N^{\pi} = 0) = 1.$$

[Figure 53, panels (a) and (b), and Figure 54 are transition charts that are not reproduced here.]

FIGURE 53. To the definition of an arbitrage-free market

FIGURE 54. Typical pattern of transitions on an
arbitrage-free market

Geometrically, this means that if $\pi$ is arbitrage-free (and $X_N^{\pi} \ge 0$), then the
chart (Fig. 53a) depicting transitions from $X_0^{\pi} = 0$ to $X_N^{\pi}$ must actually be ‘degenerate’
as in Fig. 53b, where the dashed lines correspond to transitions $X_0^{\pi} = 0 \to X_N^{\pi}$
of probability zero.

In general, if $X_0^{\pi} = \beta_0 B_0 + \gamma_0 S_0 = 0$, then the chart of transitions in an
arbitrage-free market must be as in Fig. 54: if $P(X_N^{\pi} = 0) < 1$, then both gains
($P(X_N^{\pi} > 0) > 0$) and losses ($P(X_N^{\pi} < 0) > 0$) must be possible. This can also be
reformulated as follows: a strategy $\pi$ (with $X_0^{\pi} = 0$) on an arbitrage-free market must
be either trivial (i.e., $P(X_N^{\pi} \ne 0) = 0$) or risky (i.e., we have both $P(X_N^{\pi} > 0) > 0$
and $P(X_N^{\pi} < 0) > 0$).
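
The dichotomy ‘trivial or risky’ is easy to examine in the single-step model of § 1c with finitely many outcomes. With $B_0 = 1$ and zero starting capital we have $\beta = -\gamma S_0$, so the terminal value is $X_1^{\pi} = \gamma S_0(\rho - r)$. The Python sketch below classifies a zero-cost strategy accordingly. The function name, the labels it returns, and the numerical values are hypothetical, and every listed outcome of $\rho$ is assumed to have positive probability.

```python
from fractions import Fraction as F


def classify_zero_cost_strategy(gamma, S0, r, rho_values):
    """Classify X_1 = gamma * S0 * (rho - r) over the possible values of rho."""
    values = [gamma * S0 * (rho - r) for rho in rho_values]
    if all(v == 0 for v in values):
        return "trivial"
    if all(v >= 0 for v in values):
        return "arbitrage"
    if all(v <= 0 for v in values):
        return "sure loss"  # the negated strategy is then an arbitrage
    return "risky"


# Hypothetical two-point model (illustration only).
S0 = F(100)
rho_values = [F(-1, 10), F(1, 5)]

inside = classify_zero_cost_strategy(F(1), S0, F(1, 20), rho_values)
nothing = classify_zero_cost_strategy(F(0), S0, F(1, 20), rho_values)
below = classify_zero_cost_strategy(F(1), S0, F(-1, 5), rho_values)
print(inside, nothing, below)
```

When $r$ lies strictly between the two values of $\rho$, every non-zero position is risky. When $r$ lies below both values, buying stock with borrowed money is an arbitrage.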

Along with the above definition of an arbitrage-free market, several other definitions
are also used in finance (see, for instance, [251]). We present here two
examples.

DEFINITION 3. a) A (B, S)-market is said to be arbitrage-free in the weak sense
if for each self-financing strategy $\pi$ satisfying the relations $X_0^{\pi} = 0$ and $X_n^{\pi} \ge 0$
(P-a.s.) for all $n \le N$ we have $X_N^{\pi} = 0$ (P-a.s.).

b) A (B, S)-market is said to be arbitrage-free in the strong sense if for each
self-financing strategy $\pi$ the relations $X_0^{\pi} = 0$ and $X_N^{\pi} \ge 0$ (P-a.s.) imply that
$X_n^{\pi} = 0$ (P-a.s.) for all $n \le N$.

Remark. Note that in the above definitions we consider events of the form $\{X_N^{\pi} > 0\}$,
$\{X_N^{\pi} \ge 0\}$, or $\{X_N^{\pi} = 0\}$, which are clearly the same as $\{\widetilde{X}_N^{\pi} > 0\}$, $\{\widetilde{X}_N^{\pi} \ge 0\}$, or
$\{\widetilde{X}_N^{\pi} = 0\}$, respectively, where $\widetilde{X}_N^{\pi} = \dfrac{X_N^{\pi}}{B_N}$ (provided that $B_N > 0$). This explains
why, in the discussion of the ‘presence’ or ‘absence’ of arbitrage on a (B, S)-market,
one can restrict oneself to $(\widetilde{B}, \widetilde{S})$-markets with $\widetilde{B}_n \equiv 1$ and $\widetilde{S}_n = \dfrac{S_n}{B_n}$. In other
words, if we assume that $B_n > 0$, then we can also assume without loss of generality
that $B_n \equiv 1$.


§ 2b. Martingale Criterion of the Absence of Arbitrage. 
First Fundamental Theorem 


1. In our case of discrete time $n = 0, 1, \dots, N$ we have the following remarkable
result, which, due to its importance, is called the First fundamental asset pricing
theorem.

THEOREM A. Assume that a (B, S)-market on a filtered probability space
$(\Omega, \mathscr{F}, (\mathscr{F}_n), P)$ is formed by a bank account $B = (B_n)$,
$B_n > 0$, and finitely many assets $S = (S^1, \dots, S^d)$, $S^i = (S^i_n)$.

Assume also that this market operates at the instants $n = 0, 1, \dots, N$,
$\mathscr{F}_0 = \{\emptyset, \Omega\}$, and $\mathscr{F}_N = \mathscr{F}$.

Then this (B, S)-market is arbitrage-free if and only if there exists (at least
one) measure $\widetilde{P}$ (a ‘martingale’ measure) equivalent to the measure $P$ such that the
$d$-dimensional discounted sequence

$$\frac{S}{B} = \Bigl(\frac{S_n}{B_n}\Bigr)$$

is a $\widetilde{P}$-martingale, i.e.,

$$E_{\widetilde{P}} \Bigl|\frac{S_n^i}{B_n}\Bigr| < \infty \tag{1}$$

for all $i = 1, \dots, d$ and $n = 0, 1, \dots, N$ and

$$E_{\widetilde{P}}\Bigl(\frac{S_n^i}{B_n} \Bigm| \mathscr{F}_{n-1}\Bigr) = \frac{S_{n-1}^i}{B_{n-1}} \quad (\widetilde{P}\text{-a.s.}) \tag{2}$$

for $n = 1, \dots, N$.
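
In the simplest special case (one step, one stock, finitely many values of $\rho$, each of positive $P$-probability) the criterion of Theorem A can be made completely explicit: a measure equivalent to $P$ is a set of strictly positive weights, and the martingale property is the single condition that the mean of $\rho$ equals $r$. The Python sketch below constructs such weights when they exist. The construction (uniform weights within the groups below and above $r$, then a mixture) is one convenient choice made for this sketch, not a method taken from the text, and the sample values are hypothetical.

```python
from fractions import Fraction as F


def martingale_measure(rho_values, r):
    """Strictly positive weights on the distinct values rho_values with
    total mass 1 and mean r, or None if no such weights exist."""
    low = [x for x in rho_values if x < r]
    mid = [x for x in rho_values if x == r]
    up = [x for x in rho_values if x > r]
    if not low and not up:
        return {x: F(1, len(mid)) for x in mid}  # all mass sits at r
    if not low or not up:
        return None  # rho - r keeps one sign: a zero-cost arbitrage exists
    mean_low = sum(low) / len(low)
    mean_up = sum(up) / len(up)
    w = (r - mean_low) / (mean_up - mean_low)  # mass of the upper group
    scale = F(1, 2) if mid else F(1)           # leave half the mass for r
    weights = {x: scale * (1 - w) / len(low) for x in low}
    weights.update({x: scale * w / len(up) for x in up})
    weights.update({x: (1 - scale) / len(mid) for x in mid})
    return weights


# Hypothetical three-point model (illustration only).
rho_values = [F(-1, 10), F(0), F(1, 5)]
Q = martingale_measure(rho_values, F(1, 20))
no_measure = martingale_measure(rho_values, F(1, 4))
print(Q, no_measure)
```

For $r = 1/20$ the weights are $3/10$, $3/10$, $2/5$. For $r = 1/4$, which exceeds every value of $\rho$, no martingale measure exists, and indeed selling the stock short and depositing the proceeds is then an arbitrage.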


We split the proof of this result into several steps: we prove the sufficiency in § 2c
and the necessity in § 2d. In § 2e we present another proof, of a slightly more
general version of this result. Right now, we make several observations concerning
the meaningfulness of the above criterion.

We have already mentioned that the assumption of the absence of arbitrage has
a clear economic meaning; this is a desirable property of a market to be ‘rational’,
‘efficient’, ‘fair’. The value of this theorem (which is due to J. M. Harrison and
D. M. Kreps [214] and J. M. Harrison and S. R. Pliska [215] in the case of finite $\Omega$
and to R. C. Dalang, A. Morton, and W. Willinger [92] for arbitrary $\Omega$) is that
it shows a way to analytic calculations relevant to transactions of financial assets
in these ‘arbitrage-free’ markets. (This is why it is called the First fundamental
asset pricing theorem.) We have, in essence, already demonstrated this before, in
our calculations of upper and lower prices (§§ 1b, c). We shall consistently use this
criterion below; e.g., in our considerations of forward and futures prices or rational
option prices (Chapter VI).

This theorem is also very important conceptually, for it demonstrates that the 
fairly vague concept of efficient, rational market (Chapter I, § 2a), put forward as 
some justification of the postulate of the martingale property of prices, becomes rig- 
orous in the disguise of the concept of arbitrage-free market: a market is ‘rationally 
organized’ if the investors get no opportunities for riskless profits. 


2. In operations with sequences $X = (X_n)$ that are martingales it is important to
indicate not only the measure $P$, but also the flow of $\sigma$-algebras $(\mathscr{F}_n)$ in terms of
which we state the martingale properties:

the $X_n$ are $\mathscr{F}_n$-measurable,
$E|X_n| < \infty$,
$E(X_{n+1} \mid \mathscr{F}_n) = X_n$ (P-a.s.).

To emphasize this, we say that the martingale in question is a P-martingale or
a $(P, (\mathscr{F}_n))$-martingale and write $X = (X_n, \mathscr{F}_n, P)$.

Note that if $X$ is a $(P, (\mathscr{F}_n))$-martingale, then $X$ is also a $(P, (\mathscr{G}_n))$-martingale
with respect to each ‘smaller’ flow $(\mathscr{G}_n)$ (such that $\mathscr{G}_n \subseteq \mathscr{F}_n$), provided that the $X_n$
are $\mathscr{G}_n$-measurable. Indeed, by the ‘telescopic’ property of conditional expectations
we see that

$$E(X_{n+1} \mid \mathscr{G}_n) = E\bigl(E(X_{n+1} \mid \mathscr{F}_n) \mid \mathscr{G}_n\bigr) = E(X_n \mid \mathscr{G}_n) = X_n \quad \text{(P-a.s.)},$$

which means the martingale property.

Clearly, if $X$ is a $(P, (\mathscr{F}_n))$-martingale, then there exists the ‘minimal’ flow
$(\mathscr{G}_n)$ such that $X$ is a $(\mathscr{G}_n)$-martingale: the ‘natural’ flow generated by $X$, i.e.,
$\mathscr{G}_n = \sigma(X_0, X_1, \dots, X_n)$.

We can recall in this connection that we defined a ‘weakly efficient’ market (in
Chapter I, § 2a) as a market where the ‘information flow’ $(\mathscr{F}_n)$ was generated by
the past values of the prices of all the assets ‘traded’ in this market, so that $(\mathscr{F}_n)$
is just the ‘minimal’ flow on such a market.
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3. One might well wonder if the theorem still holds for d = œ or N = co. 

The following example. which is due to W. Schachermayer ([424]), shows that 
if d = œ (and N = 1). then there exists an ‘arbitrage-free’ market without a 
‘martingale’ measure, so that the ‘necessity’ part of the above theorem fails in 
general for d = ». 


EXAMPLE 1. Let Q = {1,2....}, let Fo = {9.Q}, let F = F, be the o-algebra 
oC 
generated by all finite subsets of Q. and let P= Ð 2-*§, ie. P{k} =27*. 
kal 
We define the sequence of prices S = (St) for i = 1,2,... and n = 0.1 as 


follows: 
l, w=2, 


ASj(w) = 4 -1l, w=i+1, 
0 otherwise. 
Clearly, the corresponding (B, S)-market with Bo = Bı = 1 is arbitrage-free. 


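However, martingale measures do not necessarily exist. In fact, the value $X_1^\pi(\omega)$ of
an arbitrary portfolio can be represented as the sum

$$X_1^\pi = c_0 + \sum_{i=1}^{\infty} c_i S_1^i = X_0^\pi + \sum_{i=1}^{\infty} c_i \Delta S_1^i,$$

where $X_0^\pi = c_0 + \sum_{i=1}^{\infty} c_i$ (here we assume that $\sum |c_i| < \infty$). If $X_0^\pi = 0$ (i.e.,
$c_0 + \sum_{i=1}^{\infty} c_i = 0$), then by the condition $X_1^\pi \geq 0$ we obtain

$$X_1^\pi(1) = c_1 \geq 0, \quad X_1^\pi(2) = c_2 - c_1 \geq 0, \quad \ldots, \quad X_1^\pi(k) = c_k - c_{k-1} \geq 0, \quad \ldots$$

Thus the $c_i$ form a non-negative non-decreasing sequence with $\sum |c_i| < \infty$.
Hence all the $c_i$ are equal to zero, so that $X_1^\pi = 0$ ($P$-a.s.). However, a martingale
measure cannot exist.

For assume that there exists a measure $\widetilde{P} \sim P$ such that $S$ is a martingale with
respect to it. Then for each $i = 1, 2, \ldots$ we have

$$E_{\widetilde{P}}\, \Delta S_1^i = 0,$$

i.e., $\widetilde{P}\{i\} = \widetilde{P}\{i + 1\}$ for $i = 1, 2, \ldots$. Clearly, this is impossible for a probability
measure.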


The next counterexample relates to the question of whether the ‘sufficiency’
part of the theorem holds for $N = \infty$. It shows that the existence of a martingale
measure does not ensure that there is no arbitrage: there can be opportunities for
arbitrage described below. (Note that the prices $S$ in this counterexample are not
necessarily positive, which makes this example appear somewhat deficient.)

EXAMPLE 2. Let $\xi = (\xi_n)_{n \geq 0}$ be a sequence of independent identically distributed
random variables on $(\Omega, \mathcal{F}, P)$ such that $P(\xi_n = 1) = P(\xi_n = -1) = \frac{1}{2}$.
We set $S_0 = 0$, $S_n = \xi_1 + \cdots + \xi_n$, $B_n = 1$, and let

$$X_n^\pi = \sum_{1 \leq k \leq n} \gamma_k \Delta S_k = \sum_{1 \leq k \leq n} \gamma_k \xi_k,$$

where

$$\gamma_k = \begin{cases} 2^{k-1} & \text{if } \xi_1 = \cdots = \xi_{k-1} = -1, \\ 0 & \text{otherwise}. \end{cases}$$

As is well known, we can treat $X_n^\pi$ as the gain of a gambler playing against
a ‘symmetric’ adversary, when the outcomes are described by the variables $\xi_k$
(he gains if $\xi_k = 1$ and loses if $\xi_k = -1$) and he doubles the stake after a loss.

Clearly, if $\xi_1 = \cdots = \xi_k = -1$ (i.e., the gambler loses all the time), then his
gain is

$$X_k^\pi = -\sum_{i=1}^{k} 2^{i-1} = -(2^k - 1),$$

i.e., he is a net loser.
However, if he gains at the next instant $k + 1$, i.e., if $\xi_{k+1} = 1$, then his gain
becomes

$$X_{k+1}^\pi = X_k^\pi + 2^k = -(2^k - 1) + 2^k = 1.$$

Hence if the concept of ‘strategy’ embraces (in addition to a choice of a portfolio)
also a (random) stopping instant $\tau$, then the gain of our gambler can be positive.
For let

$$\tau = \inf\{k : X_k^\pi = 1\}.$$

Since $P(\tau = k) = \left(\frac{1}{2}\right)^k$, it follows that $P(\tau < \infty) = 1$, and therefore $E X_\tau^\pi = 1$
because $P(X_\tau^\pi = 1) = 1$ (although the starting capital $X_0^\pi$ was zero).

Thus, there exists an opportunity for arbitrage on our $(B, S)$-market with $B_n = 1$.
Namely, there exists a portfolio $\pi$ such that $X_0^\pi = 0$, but the expectation $E X_\tau^\pi = 1$
for some $\tau$.

We note by the way that the strategy of doubling the stake after a loss used 
here implies that either the gambler is immensely rich or he can borrow indefinitely 
from somewhere; both variants are, of course, hardly probable. 
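The arithmetic of the doubling strategy is easy to trace on a concrete sequence of outcomes. The following is a minimal sketch under the assumptions of Example 2 (stake $2^{k-1}$ while all earlier outcomes were losses, zero stake afterwards); the function names `doubling_gains` and `stopping_time` are illustrative only.

```python
def doubling_gains(outcomes):
    """Running gains X_1, ..., X_n of the doubling strategy.

    outcomes: iterable of +1/-1 values xi_1, xi_2, ...
    The stake gamma_k equals 2**(k-1) while every earlier outcome was -1,
    and 0 afterwards.
    """
    gains = []
    total = 0
    all_losses_so_far = True
    for k, xi in enumerate(outcomes, start=1):
        stake = 2 ** (k - 1) if all_losses_so_far else 0
        total += stake * xi
        gains.append(total)
        if xi != -1:
            all_losses_so_far = False
    return gains


def stopping_time(outcomes):
    """tau = first k with X_k = 1, or None if this does not happen."""
    for k, x in enumerate(doubling_gains(outcomes), start=1):
        if x == 1:
            return k
    return None
```

For the outcomes `[-1, -1, -1, 1]` the running gains are $-1, -3, -7, 1$: the intermediate debt grows like $2^k - 1$, which is why the strategy presupposes unlimited credit.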




For this reason, in the considerations of various issues of arbitrage theory, one 
should impose some sound, ‘economically reasonable’ constraints on the classes of 
admissible strategies. (See Chapter VII, § 1a on this subject.)


§ 2c. Martingale Criterion of the Absence of Arbitrage.
Proof of Sufficiency 


We claim that if there exists a martingale measure $\widetilde{P}$ equivalent to the measure $P$
such that the sequence

$$\frac{S}{B} = \left(\frac{S_n}{B_n}\right)_{0 \leq n \leq N}$$

is a $(\widetilde{P}, (\mathcal{F}_n))$-martingale, then there can be no arbitrage on the $(B, S)$-market.
As already mentioned (see the end of § 2a), the condition $B_n > 0$, $n \geq 0$ (which
we assume to hold), allows one to set $B_n = 1$.
We use formula (2) in § 2a, according to which

$$X_n^\pi = X_0^\pi + G_n^\pi \quad \text{and} \quad G_n^\pi = \sum_{k=1}^{n} \gamma_k \Delta S_k, \tag{1}$$

where $S = (S_n)$ is a $\widetilde{P}$-martingale.
To prove the required assertion we must show that if $\pi \in SF$ is a strategy such
that $X_0^\pi = 0$ and $P(X_N^\pi \geq 0) = 1$, i.e.,

$$G_N^\pi = \sum_{k=1}^{N} \gamma_k \Delta S_k \geq 0 \tag{2}$$

($P$-a.s., or, equivalently, $\widetilde{P}$-a.s.), then $G_N^\pi = 0$ ($P$-a.s., or, equivalently, $\widetilde{P}$-a.s.).

We use the theorem and the lemma in Chapter II, § 1c.

The sequence $(G_n^\pi)_{0 \leq n \leq N}$ is a martingale transform with respect to $\widetilde{P}$ and,
therefore, by the theorem a local martingale. Since $G_N^\pi \geq 0$, it follows by the
lemma that $(G_n^\pi)_{0 \leq n \leq N}$ is in fact a $\widetilde{P}$-martingale and therefore $E_{\widetilde{P}} G_N^\pi = G_0^\pi = 0$.

Hence $X_N^\pi = G_N^\pi = 0$ ($\widetilde{P}$-a.s. and $P$-a.s.).
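The key step, $E_{\widetilde{P}} G_N^\pi = 0$ for a martingale transform, can be illustrated in exact arithmetic. The sketch below assumes a hypothetical three-step binomial market with $B_n = 1$; the price factors, the starting price, and the predictable rule `gamma` are illustrative choices and are not taken from the text.

```python
from fractions import Fraction
from itertools import product

UP, DOWN = Fraction(2), Fraction(1, 2)   # hypothetical price factors
Q_UP = (1 - DOWN) / (UP - DOWN)          # up-probability making S a martingale
N = 3
S0 = Fraction(8)


def price_path(moves):
    """Prices S_0, ..., S_N along a path of moves (True = up)."""
    path = [S0]
    for m in moves:
        path.append(path[-1] * (UP if m else DOWN))
    return path


def path_prob(moves):
    """Probability of the path under the martingale measure."""
    p = Fraction(1)
    for m in moves:
        p *= Q_UP if m else 1 - Q_UP
    return p


def gamma(k, path):
    """A hypothetical predictable stake: uses only S_{k-1}, known before step k."""
    return 1 if path[k - 1] <= S0 else -2


def gain(moves):
    """G_N = sum_k gamma_k * (S_k - S_{k-1})."""
    path = price_path(moves)
    return sum(gamma(k, path) * (path[k] - path[k - 1]) for k in range(1, N + 1))


paths = list(product([True, False], repeat=N))
expected_gain = sum(path_prob(m) * gain(m) for m in paths)
min_gain = min(gain(m) for m in paths)
```

Here `expected_gain` is exactly zero while `min_gain` is negative: a strategy whose gain has zero mean under an equivalent measure cannot be non-negative everywhere unless it vanishes, which is the content of the sufficiency argument.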


§2d. Martingale Criterion of the Absence of Arbitrage. 
Proof of Necessity 
(by Means of the Esscher Conditional Transformation) 


1. We must now prove that the absence of arbitrage means the existence of a
probability measure $\widetilde{P} \sim P$ in $(\Omega, \mathcal{F})$ such that the sequence $S = (S_n)_{0 \leq n \leq N}$ is a
$\widetilde{P}$-martingale.

There exist several proofs of this result and its generalization to the continuous-time
case (see, e.g., [92], [100], [171], [215], [259], [443], and [455]). They all appeal in
one way or another to concepts and results of functional analysis (the Hahn–Banach
theorem, separation in finite-dimensional Euclidean spaces, Hilbert spaces, etc.).
At the same time, they all are ‘existence proofs’ and suggest no explicit constructions
of martingale measures, to say nothing about an explicit description of
all martingale measures $\widetilde{P}$ equivalent to $P$.


It would be interesting for this reason to find a proof of the necessity part of the 
theorem where one explicitly constructs all martingale measures or at least some 
subclass of them. This becomes important once we take into account that in the 
calculation of upper and lower prices we must find the upper and lower bounds over
the class of all measures $\widetilde{P}$ equivalent to the original measure $P$ (see § 1c).

It is in this way of an explicit construction of a martingale measure that we shall 
carry out the proof. We shall follow the ideas of L. C. G. Rogers [407] and use the 
construction of equivalent measures based on the Esscher conditional transforma- 
tions. 


2. To explain the main idea we consider first the single-step model ($N = 1$), where
we assume for simplicity that $d = 1$, $B_0 = B_1 = 1$, and $\mathcal{F}_0 = \{\varnothing, \Omega\}$. We also
assume that $P(S_1 \neq S_0) > 0$. Otherwise we shall obtain an uninteresting trivial
market and we can take the original measure $P$ as a martingale measure.

Now, each portfolio $\pi$ is a pair of numbers $(\beta, \gamma)$. If $X_0^\pi = 0$, then only the pairs
such that $\beta + \gamma S_0 = 0$ are admissible.

The assumption of the absence of arbitrage means that the following two con- 
ditions must be met on such a (non-trivial) market: 


$$P(\Delta S_1 > 0) > 0 \quad \text{and} \quad P(\Delta S_1 < 0) > 0. \tag{1}$$


Hence Fig. 54 in § 2a takes now the following form: 


[Figure omitted; axis: $\Delta S_1$.]

FIGURE 55. Typical arbitrage-free situation. Case N = 1


We must deduce from (1) that there exists a measure $\widetilde{P} \sim P$ such that
1) $E_{\widetilde{P}}|\Delta S_1| < \infty$,
2) $E_{\widetilde{P}}\Delta S_1 = 0$.


It can be useful to formulate the corresponding result with no mention of ‘arbitrage’,
in the following, purely probabilistic form.

LEMMA 1. Let $X$ be a real-valued random variable with probability distribution $P$
on $(\mathbb{R}, \mathcal{B}(\mathbb{R}))$ such that

$$P(X > 0) > 0 \quad \text{and} \quad P(X < 0) > 0. \tag{2}$$

Then there exists a measure $\widetilde{P} \sim P$ such that

$$E_{\widetilde{P}}\, e^{aX} < \infty$$

for each $a \in \mathbb{R}$; in particular,

$$E_{\widetilde{P}}|X| < \infty.$$

Moreover, $\widetilde{P}$ has the following property:

$$E_{\widetilde{P}} X = 0.$$

Proof. Given the measure $P$, we construct first the probability measure

$$Q(dx) = c\, e^{-x^2} P(dx), \quad x \in \mathbb{R}, \tag{6}$$

where $c$ is a normalizing coefficient, i.e.,

$$c^{-1} = E_P\, e^{-X^2}.$$

Let

$$\varphi(a) = E_Q\, e^{aX}$$

for $a \in \mathbb{R}$ and let

$$Z_a(x) = \frac{e^{ax}}{\varphi(a)}. \tag{7}$$

Clearly, $Q \sim P$ and it follows by the construction of $Q$ that $\varphi(a) < \infty$ for each
$a \in \mathbb{R}$ and that $\varphi(a) > 0$.

It is equally clear that $Z_a(x) > 0$ and $E_Q Z_a(X) = 1$. Hence for each $a \in \mathbb{R}$ we
can define the probability measure

$$P_a(dx) = Z_a(x)\, Q(dx) \tag{8}$$

such that $P_a \sim Q \sim P$.

The function $\varphi = \varphi(a)$ defined for $a \in \mathbb{R}$ is strictly convex (downwards) since
$\varphi''(a) > 0$.
Let

$$\varphi_* = \inf\{\varphi(a) : a \in \mathbb{R}\}.$$

Then two cases are possible:
1) there exists $a_*$ such that $\varphi(a_*) = \varphi_*$
or
2) there exists no such $a_*$.

In the first case we clearly have $\varphi'(a_*) = 0$ and

$$E_{P_{a_*}} X = E_Q\, \frac{X e^{a_* X}}{\varphi(a_*)} = \frac{\varphi'(a_*)}{\varphi(a_*)} = 0.$$

Thus, we can take $P_{a_*}$ as the required measure $\widetilde{P}$ in case 1).
We now claim that assumption (2) rules out case 2).
Indeed, let $\{a_n\}$ be a sequence such that

$$\varphi_* < \varphi(a_n) \downarrow \varphi_*. \tag{9}$$

This sequence must approach $+\infty$ or $-\infty$ since otherwise we can choose a convergent
subsequence and the minimum value is attained at a finite point, which
contradicts assumption 2).

Let $u_n = \frac{a_n}{|a_n|}$ and let $u = \lim u_n$ ($= \pm 1$).
By (2) we obtain

$$Q(uX > 0) > 0.$$

Hence there exists $\delta > 0$ such that

$$Q(uX > \delta) = \varepsilon > 0, \tag{10}$$

and we choose $\delta$ that is a continuity point of $Q$, i.e.,

$$Q(uX = \delta) = 0.$$

Consequently,

$$Q(a_n X > \delta |a_n|) = Q(u_n X > \delta) \to \varepsilon \quad \text{as } n \to \infty,$$

so that for $n$ sufficiently large we have

$$\varphi(a_n) = E_Q\, e^{a_n X} \geq E_Q\bigl[e^{a_n X} I(a_n X > \delta |a_n|)\bigr] \geq \tfrac{1}{2}\,\varepsilon \exp(\delta |a_n|) \to \infty,$$

which contradicts (9), where $\varphi_* \leq 1$.
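For a distribution with finitely many atoms the construction in the proof can be carried out numerically. The sketch below is an illustration under simplifying assumptions: the support is finite, so the preliminary damping by $e^{-x^2}$ in (6) is not needed and $Q = P$ is used; the minimizer $a_*$ is located by bisection on $\varphi'$, which is one possible method and not prescribed by the text. The name `esscher_tilt` and the sample law are hypothetical.

```python
import math


def esscher_tilt(values, probs, tol=1e-12):
    """Return (a_star, tilted_probs) for a finite distribution.

    Assumes all probs > 0 and that values contains both positive and
    negative numbers (condition (2)); then phi(a) = sum p_i e^{a x_i}
    is strictly convex with a unique minimiser a_star.
    """
    def dphi(a):
        # phi'(a) = E[X e^{aX}]
        return sum(p * x * math.exp(a * x) for x, p in zip(values, probs))

    lo, hi = -1.0, 1.0
    while dphi(lo) > 0:      # widen until phi'(lo) <= 0
        lo *= 2
    while dphi(hi) < 0:      # widen until phi'(hi) >= 0
        hi *= 2
    while hi - lo > tol:     # bisection on the increasing function phi'
        mid = 0.5 * (lo + hi)
        if dphi(mid) < 0:
            lo = mid
        else:
            hi = mid
    a_star = 0.5 * (lo + hi)
    weights = [p * math.exp(a_star * x) for x, p in zip(values, probs)]
    phi = sum(weights)       # phi(a_star), the normaliser in (7)
    return a_star, [w / phi for w in weights]


values = [-1.0, 0.5, 2.0]    # illustrative atoms
probs = [0.2, 0.3, 0.5]      # illustrative P-probabilities
a_star, tilted = esscher_tilt(values, probs)
tilted_mean = sum(x * q for x, q in zip(values, tilted))
```

The tilted weights stay strictly positive (equivalence of measures) and `tilted_mean` vanishes up to rounding. Since the original mean is positive, the minimizer `a_star` is negative.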


Remark. The above method of the construction of probability measures $P_a$, which
is based on the Esscher transformation $x \mapsto \frac{e^{ax}}{\varphi(a)}$ defined by (7), has been known in
actuarial practice since F. Esscher [144] (1932). As regards the applications of this
transformation in financial mathematics, see, for instance, [177] and [178], and as
regards its applications in actuarial mathematics, see the book [52].

3. It is easy to see from the above proof of the lemma how one can generalize
it to a vector-valued case, when one considers in place of $X$ an ordered sequence
$(X_0, X_1, \ldots, X_N)$ of $\mathcal{F}_n$-measurable random variables $X_n$, $0 \leq n \leq N$, defined on
a filtered probability space $(\Omega, \mathcal{F}, (\mathcal{F}_n), P)$ with $\mathcal{F}_0 = \{\varnothing, \Omega\}$ and $\mathcal{F}_N = \mathcal{F}$.

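LEMMA 2. Assume that

$$P(X_n > 0 \mid \mathcal{F}_{n-1}) > 0 \quad \text{and} \quad P(X_n < 0 \mid \mathcal{F}_{n-1}) > 0 \tag{11}$$

for $1 \leq n \leq N$. Then there exists a probability measure $\widetilde{P}$ equivalent to $P$ in the
space $(\Omega, \mathcal{F})$ such that the sequence $(X_0, X_1, \ldots, X_N)$ is a $(\widetilde{P}, (\mathcal{F}_n))$-martingale
difference.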

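Proof. If necessary, we can proceed first from $P$ to a new measure $Q$ such that

$$Q(d\omega) = c \exp\Bigl\{-\sum_{i=0}^{N} X_i^2(\omega)\Bigr\} P(d\omega) \tag{12}$$

and the generating function $E_Q \exp\bigl\{\sum_{i=0}^{N} a_i X_i\bigr\}$ is finite.

We can construct the required measure $\widetilde{P} = \widetilde{P}(d\omega)$ as follows (cf. (8)).

Let

$$\varphi_n(a; \omega) = E_Q(e^{aX_n} \mid \mathcal{F}_{n-1})(\omega). \tag{13}$$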


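For each fixed $\omega$ these functions (in view of (11)) are strictly (downwards) convex
in $a$. As in Lemma 1 we can show that there exists a unique (finite) value $a_n = a_n(\omega)$
such that the smallest value $\inf_a \varphi_n(a; \omega)$ is attained at $a_n$.

Since here $\inf_{a \in \mathbb{R}} = \inf_{a \in \mathbb{Q}}$, where $\mathbb{Q}$ is the set of rationals, the function
$\varphi_n(\omega) = \inf_a \varphi_n(a; \omega)$ is $\mathcal{F}_{n-1}$-measurable, which means that $a_n(\omega)$ is also an
$\mathcal{F}_{n-1}$-measurable function.
Indeed, if $[A, B]$ is a closed interval, then

$$\{\omega : a_n(\omega) \in [A, B]\} = \bigcap_{m} \bigcup_{a \in \mathbb{Q} \cap [A, B]} \Bigl\{\omega : \varphi_n(a; \omega) < \varphi_n(\omega) + \frac{1}{m}\Bigr\} \in \mathcal{F}_{n-1},$$

and therefore $a_n(\omega)$ is $\mathcal{F}_{n-1}$-measurable.
Next, we define recursively a sequence $Z_0, Z_1(\omega), \ldots, Z_N(\omega)$ by setting $Z_0 = 1$
and

$$Z_n(\omega) = Z_{n-1}(\omega)\, \frac{\exp\{a_n(\omega) X_n(\omega)\}}{E_Q(\exp\{a_n X_n\} \mid \mathcal{F}_{n-1})(\omega)} \tag{14}$$

for $n \geq 1$. Clearly, the variables $Z_n(\omega)$ are $\mathcal{F}_n$-measurable and form a martingale:

$$E_Q(Z_n \mid \mathcal{F}_{n-1}) = Z_{n-1} \quad (P\text{-a.s.}).$$

We define the required measure $\widetilde{P}$ by the formula

$$\widetilde{P}(d\omega) = Z_N(\omega)\, P(d\omega). \tag{15}$$

As in Lemma 1, it easily follows from this definition that $E_{\widetilde{P}}|X_n| < \infty$, $0 \leq n \leq N$,
and

$$E_{\widetilde{P}}(X_n \mid \mathcal{F}_{n-1}) = 0, \quad 1 \leq n \leq N. \tag{16}$$

By Lemma 1, $E_{\widetilde{P}} X_0 = 0$. Hence the sequence $(X_0, X_1, \ldots, X_N)$ is a martingale
difference with respect to $\widetilde{P}$, which proves the lemma.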
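The recursive construction (14)–(15) amounts to tilting each conditional law separately and multiplying the resulting factors along a path. The following sketch illustrates this on a hypothetical two-step model with finitely many outcomes (so no preliminary damping as in (12) is needed); the laws `law1`, `law2` and the helper `tilt` are assumptions made for the example only.

```python
import math


def tilt(values, probs):
    """Esscher-tilt a finite law so that its mean is zero (bisection on phi')."""
    def dphi(a):
        return sum(p * x * math.exp(a * x) for x, p in zip(values, probs))

    lo, hi = -1.0, 1.0
    while dphi(lo) > 0:
        lo *= 2
    while dphi(hi) < 0:
        hi *= 2
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if dphi(mid) < 0:
            lo = mid
        else:
            hi = mid
    a = 0.5 * (lo + hi)
    w = [p * math.exp(a * x) for x, p in zip(values, probs)]
    total = sum(w)               # conditional normaliser, cf. the denominator in (14)
    return [wi / total for wi in w]


# Hypothetical model: law of X_1, and law of X_2 given the value of X_1.
law1 = ([-1.0, 2.0], [0.5, 0.5])
law2 = {-1.0: ([-1.0, 1.0], [0.3, 0.7]),
        2.0: ([-2.0, 1.0, 3.0], [0.2, 0.3, 0.5])}

tilted1 = dict(zip(law1[0], tilt(*law1)))
tilted2 = {x1: dict(zip(v, tilt(v, p))) for x1, (v, p) in law2.items()}

# Path probabilities under the new measure: product of the conditional tilts.
path_prob = {(x1, x2): tilted1[x1] * q
             for x1 in tilted1 for x2, q in tilted2[x1].items()}

mean_x1 = sum(x1 * p for (x1, _), p in path_prob.items())
cond_mean_x2 = {x1: sum(x2 * q for x2, q in tilted2[x1].items())
                for x1 in tilted2}
```

The tilt parameter for the second step depends on the observed value of $X_1$, which is the discrete counterpart of the $\mathcal{F}_{n-1}$-measurability of $a_n(\omega)$; both `mean_x1` and every entry of `cond_mean_x2` vanish up to rounding, as in (16).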


4. For $d = 1$, the necessity of the existence of a martingale measure $\widetilde{P} \sim P$
(in an arbitrage-free market) is a consequence of Lemma 2. For let $X_0 = S_0$,
$X_1 = \Delta S_1, \ldots, X_N = \Delta S_N$. Since there can be no arbitrage, we can assume
without loss of generality that

$$P(\Delta S_n > 0 \mid \mathcal{F}_{n-1}) > 0 \quad \text{and} \quad P(\Delta S_n < 0 \mid \mathcal{F}_{n-1}) > 0 \tag{17}$$

for each $n = 1, \ldots, N$.

Indeed, if $P(\Delta S_n = 0) = 1$ for some $n$, then we can skip the instant $n$ because
no contribution to the value $X_N^\pi$ of an arbitrary self-financing portfolio $\pi$ can be
made at this time.

On the other hand, if there exists $n$ such that

$$P(\Delta S_n \geq 0) = 1$$

or $P(\Delta S_n \leq 0) = 1$, then $P(\Delta S_n = 0) = 1$ due to the absence of arbitrage.
(Otherwise it is easy to construct a strategy $\pi$ such that $X_N^\pi > 0$ with positive
probability.) Again, the corresponding contribution to $X_N^\pi$ is zero.

Thus, we can assume that (17) holds for all $n \leq N$, and the required necessity
follows directly by Lemmas 1 and 2 as applied to $X_0 = S_0$ and $X_n = \Delta S_n$,
$1 \leq n \leq N$.


5. We consider now the general case of d > 1. Conceptually, the proof is the same 
as for d = 1; it can be carried out using the following generalization of Lemma 2 to 
the vector-valued case. 


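LEMMA 3. Let $(X_0, X_1, \ldots, X_N)$ be a sequence of $\mathcal{F}_n$-measurable $d$-vectors

$$X_n = \begin{pmatrix} X_n^1 \\ \vdots \\ X_n^d \end{pmatrix}$$

defined on a filtered probability space $(\Omega, \mathcal{F}, (\mathcal{F}_n)_{0 \leq n \leq N}, P)$ with $\mathcal{F}_0 = \{\varnothing, \Omega\}$,
$\mathcal{F}_N = \mathcal{F}$.
Assume also that if

$$\gamma_n = \begin{pmatrix} \gamma_n^1 \\ \vdots \\ \gamma_n^d \end{pmatrix}$$

is a non-zero $\mathcal{F}_{n-1}$-measurable vector-valued variable with bounded components
($|\gamma_n^i(\omega)| \leq c < \infty$, $\omega \in \Omega$) such that

$$P((\gamma_n, X_n) > 0 \mid \mathcal{F}_{n-1}) > 0 \quad (P\text{-a.s.}),$$

then also

$$P((\gamma_n, X_n) < 0 \mid \mathcal{F}_{n-1}) > 0 \quad (P\text{-a.s.}),$$

where $(\gamma_n, X_n)$ is the scalar product.

Then there exists a probability measure $\widetilde{P}$ equivalent to $P$ on $(\Omega, \mathcal{F})$ such that
the sequence $(X_0, X_1, \ldots, X_N)$ is a $d$-dimensional martingale difference with respect
to $\widetilde{P}$: $E_{\widetilde{P}}|X_n| < \infty$, $E_{\widetilde{P}} X_0 = 0$, and $E_{\widetilde{P}}(X_n \mid \mathcal{F}_{n-1}) = 0$, $1 \leq n \leq N$.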


Proof. If the topological support (i.e., the smallest closed carrier) of the regular
conditional probability $P(X_n \in \cdot \mid \mathcal{F}_{n-1})(\omega)$ does not lie in a proper subspace of $\mathbb{R}^d$,
then, as for $d = 1$, the functions

$$\varphi_n(a; \omega) = E\bigl(e^{(a, X_n)} \mid \mathcal{F}_{n-1}\bigr)(\omega), \quad a \in \mathbb{R}^d,$$

are strictly convex and the smallest value $\inf_a \varphi_n(a; \omega)$ is attained at a unique point
$a_n = a_n(\omega) \in \mathbb{R}^d$; moreover, the functions $a_n(\omega)$ are $\mathcal{F}_{n-1}$-measurable.

The case of a regular conditional distribution $P(X_n \in \cdot \mid \mathcal{F}_{n-1})(\omega)$ concentrated
on a proper subspace of $\mathbb{R}^d$ is slightly more delicate. As shown in [407], we can find
in this case a unique $\mathcal{F}_{n-1}$-measurable variable $a_n = a_n(\omega)$ delivering the smallest
value $\inf_a \varphi_n(a; \omega)$.

The required measure $\widetilde{P}$ can be constructed as in (15) and (14) if we treat
$a_n(\omega) X_n(\omega)$ (in (14)) as the scalar product of the vectors $a_n(\omega)$ and $X_n(\omega)$.


6. The above construction of a martingale measure based on the Esscher conditional
transformation gives us only one particular measure, although the class of
martingale measures equivalent to the original one can be richer. We devote the
next section to several approaches based on the Girsanov transformation that can
be used in the construction of a family of measures $\widetilde{P}$ equivalent to or absolutely
continuous with respect to $P$ such that the sequence of the discounted prices is a
martingale with respect to these measures.

The Esscher transformation has long been in use in actuarial calculations (see,
e.g., [52]). ‘Esscher transformations’ are less familiar under this name in financial
mathematics (see nevertheless the already mentioned papers [177] and [178]),
where several results of I. V. Girsanov on changes of measures known as ‘Girsanov’s
theorem’ play an important role.

In fact, these transformations have very much in common and we shall discuss
this in detail in § 3 of this chapter. Here we only mention in connection with
Lemma 1 that if $X$ is a normally distributed random variable ($X \sim \mathcal{N}(0, 1)$), then
$\varphi(a) = E_P\, e^{aX} = e^{a^2/2}$ and (see (7))

$$Z_a(x) = \frac{e^{ax}}{\varphi(a)} = e^{ax - \frac{a^2}{2}}.$$

A reader acquainted with Girsanov’s theorem will see at once that the ‘Girsanov’
exponential $e^{ax - \frac{a^2}{2}}$ involved in this theorem (see § 3a) is just the Esscher transform (7).
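The Gaussian case can be verified directly: multiplying the standard normal density by $Z_a(x) = e^{ax - a^2/2}$ gives the density of $\mathcal{N}(a, 1)$, i.e., the Esscher tilt shifts the mean by $a$. The short check below is only an illustration; the tilt parameter `a` and the evaluation grid are arbitrary choices.

```python
import math


def normal_pdf(x, mean=0.0):
    """Density of N(mean, 1) at x."""
    return math.exp(-0.5 * (x - mean) ** 2) / math.sqrt(2.0 * math.pi)


def phi(a):
    """phi(a) = E e^{aX} for X ~ N(0, 1), in closed form."""
    return math.exp(0.5 * a * a)


def esscher_density(a, x):
    """Z_a(x) = e^{ax} / phi(a) = exp(a x - a^2 / 2)."""
    return math.exp(a * x) / phi(a)


a = 0.7                               # arbitrary illustrative tilt parameter
grid = [-3.0, -1.0, 0.0, 0.5, 2.0]    # arbitrary evaluation points
max_gap = max(abs(esscher_density(a, x) * normal_pdf(x) - normal_pdf(x, mean=a))
              for x in grid)
```

The quantity `max_gap` is zero up to floating-point rounding, in agreement with completing the square: $ax - \frac{a^2}{2} - \frac{x^2}{2} = -\frac{(x - a)^2}{2}$.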


§ 2e. Extended Version of the First Fundamental Theorem 


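1. Let $\mathcal{P}(P)$ and $\mathcal{P}_{loc}(P)$ be the sets of all probability measures $\widetilde{P} \sim P$ such that
the discounted prices

$$\frac{S}{B} = \left(\frac{S_n}{B_n}\right)_{0 \leq n \leq N}$$

are martingales and local martingales, respectively, relative to these measures.
Let $\mathcal{P}_b(P)$ be the set of measures $\widetilde{P}$ in $\mathcal{P}(P)$ such that the Radon–Nikodym
derivatives $\frac{d\widetilde{P}}{dP}$ are bounded above: $\frac{d\widetilde{P}}{dP}(\omega) \leq C(\widetilde{P})$ ($P$-a.s.) for some constant $C(\widetilde{P})$.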


We can formulate Theorem A (the First fundamental theorem in § 2b) as follows: 
the conditions 


(i) a (B,S)-market is arbitrage-free 
and 

(ii) the set of martingale measures $\mathcal{P}(P)$ is not empty ($\mathcal{P}(P) \neq \varnothing$)
are equivalent. 


Theorem A* below is a natural generalization of this version of the first funda- 
mental theorem; it provides several equivalent characterizations of an arbitrage-free 
market and clears up the structure of the set of martingale measures. (We state 
and prove this theorem following [251].) 

First of all, we introduce our notation. 

Let $Q = Q(dx)$ be a probability measure in $(\mathbb{R}^d, \mathcal{B}(\mathbb{R}^d))$ and let

$K(Q)$ be the topological support of $Q$ (the smallest closed set carrying $Q$;
[335; vol. 5]);
$L(Q)$ be the closed convex hull of $K(Q)$;
$H(Q)$ be the smallest affine hyperplane containing $K(Q)$ (clearly, $L(Q) \subseteq H(Q)$);
$L^\circ(Q)$ be the relative interior of $L(Q)$ (in the topology of the hyperplane $H(Q)$).

For instance, if $Q$ is concentrated at a single point $a$, then $H(Q)$ coincides
with this point and $L(Q) = L^\circ(Q) = \{a\}$. Otherwise $H(Q)$ has dimension between
1 and $d$. If the dimension of $H(Q)$ is 1, then $L(Q)$ is a closed line segment
and $L^\circ(Q)$ is an open interval.
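For $d = 1$ and a law with finitely many atoms these objects are elementary, and the condition $0 \in L^\circ(Q)$ used in the proofs below reduces to the following: either $Q$ is the point mass at $0$, or $Q$ charges both open half-lines (compare condition (1) in § 2d). The sketch below is an illustration under these assumptions; the function names are hypothetical.

```python
def support_hull_1d(values, probs):
    """For a finite law Q on the real line return (K, L, L_open).

    K      : sorted support points (atoms with positive mass),
    L      : (lo, hi), standing for the closed convex hull [lo, hi] of K,
    L_open : ("point", a) if Q is a point mass at a, otherwise
             ("interval", lo, hi) standing for the open interval (lo, hi).
    """
    K = sorted({x for x, p in zip(values, probs) if p > 0})
    lo, hi = K[0], K[-1]
    L_open = ("point", lo) if lo == hi else ("interval", lo, hi)
    return K, (lo, hi), L_open


def zero_in_relative_interior(values, probs):
    """True if 0 belongs to the relative interior of L(Q) (case d = 1)."""
    _, (lo, hi), L_open = support_hull_1d(values, probs)
    if L_open[0] == "point":
        return lo == 0
    return lo < 0 < hi
```

For instance, a law on $\{-1, 2\}$ satisfies the condition, while a law on $\{0, 1\}$ does not: its hull is $[0, 1]$ and $0$ is an endpoint, so holding the asset gives a non-negative gain that is positive with positive probability.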


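Let $Q_n(\omega, \cdot)$ and $\overline{Q}_n(\omega, \cdot)$ be the regular conditional distributions

$$P(\Delta S_n \in \cdot \mid \mathcal{F}_{n-1})(\omega) \quad \text{and} \quad P\Bigl(\Delta\Bigl(\frac{S_n}{B_n}\Bigr) \in \cdot \Bigm| \mathcal{F}_{n-1}\Bigr)(\omega), \quad 1 \leq n \leq N.$$

We note that since $B_n > 0$, $0 \leq n \leq N$, and the $B_n$ are $\mathcal{F}_{n-1}$-measurable, the
sets $K(Q)$, $L(Q)$, and $L^\circ(Q)$ are the same for $Q = Q_n$ and $Q = \overline{Q}_n$. Hence we shall
assume without loss of generality in the statements and proofs below that $B_n = 1$,
$n \leq N$.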


THEOREM A* (an extended version of the First fundamental theorem). Assume 
that the conditions of Theorem A are satisfied. Then the following assertions are 
equivalent: 


First of all, we make several general observations concerning the above asser- 
tions. 

As regards the definitions of arbitrage-free and weakly or strongly arbitrage-free 
markets, see § 2a. 

It follows by the lemma in Chapter II, § 1c that, in fact, $\mathcal{P}(P) = \mathcal{P}_{loc}(P)$, and
therefore

c) $\Longleftrightarrow$ d).

Further, if the properties a), a’), a”), c), or e) hold with respect to some measure
$\widetilde{P}$ equivalent to $P$, then they also hold with respect to $P$. If b) holds with respect
to a measure $\widetilde{P} \sim P$ (i.e., $\mathcal{P}_b(\widetilde{P}) \neq \varnothing$) and the derivative $\frac{d\widetilde{P}}{dP}$ is bounded, then b)
holds also for the original measure $P$.

We observe next that we can always find a measure $\widetilde{P} \sim P$ such that all the
variables $S_n$, $n \leq N$, are integrable with respect to it and the derivative $\frac{d\widetilde{P}}{dP}$ is
bounded. For instance, it suffices to set

$$d\widetilde{P} = C \exp\Bigl(-\sum_{i=1}^{d} \sum_{n=1}^{N} |S_n^i|\Bigr)\, dP,$$

where $C$ is a normalizing coefficient.
Hence we can assume throughout the proof of the theorem that $E|S_n| < \infty$,
$n \leq N$, for the original measure $P$.
Then, in view of the obvious implications

a”) $\Longrightarrow$ a) $\Longrightarrow$ a’)

and

b) $\Longrightarrow$ c),

we must prove only that

a’) $\Longrightarrow$ e) $\Longrightarrow$ b)

and

c) $\Longrightarrow$ a”).
2. We introduce now several concepts that are necessary for the proof of these
three implications and state two auxiliary results (Lemmas 1 and 2).
Let $Q = Q(dx)$ be a measure in $(\mathbb{R}^d, \mathcal{B}(\mathbb{R}^d))$ such that

$$\int_{\mathbb{R}^d} |x|\, Q(dx) < \infty. \tag{1}$$

(We shall set $Q$ to be equal to the regular conditional distributions $Q_n(\omega, dx)$ with
$n \leq N$ in what follows, and (1) will hold ($P$-a.s.) in view of the assumption that
$E|S_n| < \infty$ for $n \leq N$.)

Let $x' = (x, v) \in E = \mathbb{R}^d \times (0, \infty)$ and let $Q' = Q'(dx')$ be the measure on the
Borel $\sigma$-algebra $\mathcal{E}$ of $E$ associated with $Q = Q(dx)$ in the following sense: $Q$ is the
‘first marginal’ of $Q'$, that is, $Q(dx) = Q'(dx, (0, \infty))$.

Let $Z(Q')$ be the family of positive Borel functions $z = z(x, v)$ in $E$ such that

$$\int_E |x|\, z(x, v)\, Q'(dx, dv) < \infty, \tag{2}$$

$$\int_E z(x, v)\, Q'(dx, dv) = 1, \tag{3}$$

and the functions $v z(x, v)$ are bounded.
Let $B(a, \varepsilon)$ be the closed ball in $\mathbb{R}^d$ with center at $a$ and of radius $\varepsilon$. Let $G$ be
the collection of all families

$$g = \bigl(k, (a_i, \varepsilon_i, \alpha_i, \alpha_i')_{i=1,\ldots,k}\bigr) \tag{4}$$

such that $k \in \{1, 2, \ldots\}$, $a_i \in \mathbb{R}^d$, $\varepsilon_i > 0$, $\alpha_i > 0$, $\alpha_i' > 0$.
With each $g \in G$ we associate the positive function

$$z_g(x, v) = \frac{1}{\max(v, 1)} \sum_{i=1}^{k} \bigl[\alpha_i I_{B(a_i, \varepsilon_i)}(x) + \alpha_i' I_{B^c(a_i, \varepsilon_i)}(x)\bigr] \tag{5}$$

(on $E$), where $B^c = \mathbb{R}^d \setminus B$.
If $E_{Q'} z_g = 1$, then (1) shows that $z_g \in Z(Q')$. Thus, we can define for such
functions $z_g$ the vectors

$$\varphi(g) = \int_E x\, z_g(x, v)\, Q'(dx, dv) \tag{6}$$

(the barycenters).
We set

$$\Phi(Q') = \{\varphi(g) : g \in G,\ E_{Q'} z_g = 1\}. \tag{7}$$


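LEMMA 1. We have the following relations:

$$L(Q) = \overline{\Phi}(Q'), \tag{8}$$

$$L^\circ(Q) \subseteq \Phi(Q'). \tag{9}$$

If $0 \notin L^\circ(Q)$, then there exists $\gamma$ in $\mathbb{R}^d$ such that

$$Q(x : (\gamma, x) \geq 0) = 1 \quad \text{and} \quad Q(x : (\gamma, x) > 0) > 0. \tag{10}$$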


Proof. First we shall prove that $L(Q) \subseteq \overline{\Phi}(Q')$.

Let $y \in K(Q)$. We claim that there exists a sequence $y_n$ in $\Phi(Q')$ such that
$y_n \to y$. In other words, each point $y$ in the topological support of $Q$ is the limit of
some sequence $y_n = \varphi(g_n)$, where $g_n \in G$, $n \geq 1$.

Let $A_n = B(y, \frac{1}{n})$ be a ball of radius $1/n$ with center at $y$. We set

$$\alpha_n = Q''(A_n), \quad \alpha_n' = Q''(A_n^c),$$

$$b_n = \int_{A_n} x\, Q''(dx), \quad b_n' = \int_{A_n^c} x\, Q''(dx),$$

where

$$Q''(A) = \int_{A \times (0, \infty)} \frac{1}{\max(v, 1)}\, Q'(dx, dv), \quad A \in \mathcal{B}(\mathbb{R}^d). \tag{11}$$

We choose $\delta_n$ such that $\delta_n \alpha_n + \frac{\alpha_n'}{n} = 1$.
Since $y \in K(Q)$, it follows that $\alpha_n > 0$. It is also clear that $\sup_n \alpha_n' < \infty$.
A fortiori, $\delta_n > 0$ (for large $n$ at any rate).
Consequently, if $g_n = (1, (y, \frac{1}{n}, \delta_n, \frac{1}{n}))$, then

$$E_{Q'} z_{g_n} = \delta_n \alpha_n + \frac{\alpha_n'}{n} = 1 \tag{12}$$

for $n$ large.
We set $y_n = \varphi(g_n)$. Then $y_n = \delta_n b_n + \frac{b_n'}{n}$, where $\sup_n |b_n'| < \infty$.

Since $\delta_n \alpha_n \to 1$ and $b_n - y\alpha_n \to 0$ (indeed, $|b_n - y\alpha_n| \leq \alpha_n / n$ and $\delta_n \alpha_n \leq 1$), it follows that $y_n \to y$, and therefore $K(Q)$
lies in the closure $\overline{\Phi}(Q')$ of the set $\Phi(Q')$:

$$K(Q) \subseteq \overline{\Phi}(Q'). \tag{13}$$

Now, note that $\Phi(Q')$ is a convex set. Hence it follows from (13) that

$$L(Q) \subseteq \overline{\Phi}(Q'). \tag{14}$$

On the other hand each point $y \in \Phi(Q')$ is the barycenter of a probability
measure equivalent to $Q$ (see (6), (7), and (11)), so that $y$ lies in the closed
convex hull $L(Q)$ of $K(Q)$.

Hence $\Phi(Q') \subseteq L(Q)$, and therefore $\overline{\Phi}(Q') \subseteq L(Q)$; together with (14) this shows
that $L(Q) = \overline{\Phi}(Q')$.

We now use the fact that a convex set contains the interior of its closure.
For the convex set $\Phi(Q')$ (considered in the topology of $H(Q)$) this means that
$L^\circ(Q) \subseteq \Phi(Q')$.

Thus, we have proved (8) and (9).


We proceed now to the proof of the second assertion of the lemma.

Let $0 \notin L^\circ(Q)$. Then there are three possible cases, which can be treated using
the standard separation machinery of convex analysis; see, e.g., [406].

The first case: $0 \notin H(Q)$. In this case let $\gamma$ be a vector directed towards the
affine hyperplane $H(Q)$ and orthogonal to it. Then $(\gamma, x) > 0$ for $x \in H(Q)$, which,
of course, means that $Q(x : (\gamma, x) > 0) = 1$.

The second case: $0 \in H(Q)$, but $0 \notin L(Q)$. Then there clearly exists a vector
$\gamma \in H(Q)$ such that $(\gamma, x) > 0$ for all $x \in L(Q)$, so that $Q(x : (\gamma, x) > 0) = 1$.

The third case: $0 \in H(Q)$ and $0 \in L(Q) \setminus L^\circ(Q)$. Then both subsets $L(Q)$ and
$K(Q)$ of the plane $H(Q)$ of some dimension $q$ lie to one side of some $(q - 1)$-dimensional
hyperplane $H'$ of $H(Q)$ containing $0$. (If $q = 1$, then $H'$ is reduced to
the point $\{0\}$.) By definition, $H(Q)$ is the smallest affine plane containing $K(Q)$.
Hence $K(Q)$ does not lie in $H'$ and $\gamma$ can be constructed as follows.

We consider an arbitrary non-zero vector $\gamma$ in $H(Q)$ that is orthogonal to $H'$ and
satisfies the inequality $(\gamma, x) \geq 0$ for each $x \in L(Q)$, so that $Q(x : (\gamma, x) \geq 0) = 1$.

Next, we note that there exists $x \in K(Q)$ such that $(\gamma, x) > 0$. Bearing in mind
that $K(Q)$ is the topological support of $Q$ we see that $Q(x : (\gamma, x) > 0) > 0$. This
completes the proof of Lemma 1.


3. The next result that we shall require relates to the problem of the existence of
a ‘measurable selection’.

LEMMA 2. Let $(E, \mathcal{E}, \mu)$ be a probability space and let $(G, \mathcal{G})$ be a Polish space
with Borel $\sigma$-algebra $\mathcal{G}$. Let $A$ be an $\mathcal{E} \otimes \mathcal{G}$-measurable subset of $E \times G$. Then there
exists a $G$-valued $\mathcal{E}/\mathcal{G}$-measurable function (a selector) $Y = Y(x)$, $x \in E$, such
that $(x, Y(x)) \in A$ for $\mu$-almost all $x$ in the $E$-projection

$$\pi(A) = \{x : (x, y) \in A \text{ for some } y \in G\}.$$


Remark 1. Here we mean by a Polish space a separable topological space that can 
be equipped with a metric consistent with the topology and making it a complete 
metric space. 

In the literature devoted to measurable selection (see, e.g., [11]) one can find
various versions of results on the existence of measurable selectors under various
assumptions about the measurable spaces $(E, \mathcal{E})$ and $(G, \mathcal{G})$. The easiest way for
us is to refer, e.g., to [102; Chapter III, Theorem 82] or [11; Appendix I, Theorem 1],
where one can find the following result (in a slightly modified form).


PROPOSITION. Let $(E, \mathcal{E})$ be an arbitrary measurable space and let $(G, \mathcal{G})$ be a
Polish space. If $A \in \mathcal{E} \otimes \mathcal{G}$, then there exists a universally measurable function
$\widehat{Y} = \widehat{Y}(x)$, $x \in E$, such that $(x, \widehat{Y}(x)) \in A$ for all $x \in \pi(A)$.


(We recall that a $G$-valued function $\widehat{Y} = \widehat{Y}(x)$ in $(E, \mathcal{E})$ is said to be universally
measurable if it is $\widehat{\mathcal{E}}/\mathcal{G}$-measurable, where $\widehat{\mathcal{E}} = \bigcap_\mu \mathcal{E}_\mu$ is the intersection of all the
$\sigma$-algebras $\mathcal{E}_\mu$ that are the completions of $\mathcal{E}$ with respect to all finite measures $\mu$
in $(E, \mathcal{E})$.)

Thus, it follows from the above proposition that there exists an $\widehat{\mathcal{E}}/\mathcal{G}$-measurable
function $\widehat{Y} = \widehat{Y}(x)$, $x \in E$, such that $(x, \widehat{Y}(x)) \in A$ for all $x \in \pi(A)$.

Since $\widehat{\mathcal{E}} = \bigcap_\mu \mathcal{E}_\mu$, the function $\widehat{Y}$ is, of course, $\mathcal{E}_\mu/\mathcal{G}$-measurable for each finite
measure $\mu$. Using the fact that $(G, \mathcal{G})$ is a Polish space and the $\sigma$-algebra $\mathcal{E}_\mu$ is the
completion of $\mathcal{E}$ with respect to $\mu$, it is easy to conclude (approximating $\widehat{Y}$ by simple
functions) that there exists an $\mathcal{E}/\mathcal{G}$-measurable function $Y$ such that $Y = \widehat{Y}$ ($\mu$-a.s.).

Hence Lemma 2 is a consequence of the above proposition. 


4. Proof a’) $\Longrightarrow$ e). Assume that the market is arbitrage-free, but e) fails. Then
there exist $n \in \{1, 2, \ldots, N\}$ and an $\mathcal{F}_{n-1}$-measurable set $B$ with $P(B) > 0$ such
that $0$ lies outside $L^\circ(Q_n(\omega, \cdot))$ for almost all $\omega \in B$.

The set

$$A = \bigl\{(\omega, \gamma) \in \Omega \times \mathbb{R}^d : \omega \in B,\ Q_n(\omega, \{x : (\gamma, x) \geq 0\}) = 1,\ Q_n(\omega, \{x : (\gamma, x) > 0\}) > 0\bigr\}$$

is $\mathcal{F}_{n-1} \otimes \mathcal{B}(\mathbb{R}^d)$-measurable and $B$ is its projection $\pi(A)$ by the last assertion in
Lemma 1.
By Lemma 2 on measurable selection there exists an $\mathcal{F}_{n-1}$-measurable vector
$g = g(\omega)$ such that

$$Q_n(\omega, \{x : (g(\omega), x) \geq 0\}) = 1, \quad Q_n(\omega, \{x : (g(\omega), x) > 0\}) > 0 \tag{15}$$

for $P$-almost all $\omega \in B$.

Since $B$ is an $\mathcal{F}_{n-1}$-measurable set, the function $\widehat{g}(\omega) = g(\omega) I_B(\omega)$ is
$\mathcal{F}_{n-1}$-measurable and, in accordance with (15), $(\widehat{g}, \Delta S_n) \geq 0$ ($P$-a.s.) and
$P((\widehat{g}, \Delta S_n) > 0) > 0$.

We set $\gamma_i = \widehat{g} I_{\{i = n\}}$, $i = 1, \ldots, N$, and consider a self-financing strategy $\pi = (\beta, \gamma)$
of value $X^\pi$ satisfying the equalities $X_0^\pi = 0$ and $\Delta X_i^\pi = (\gamma_i, \Delta S_i)$. (We determine
the parameters $\beta_i$, e.g., in the case of $B_i = 1$, from the requirement that
$X_i^\pi = (\gamma_i, S_i) + \beta_i$ must be equal to $\sum_{j=1}^{i} (\gamma_j, \Delta S_j)$.)

Clearly, for this strategy $\pi$ we have $X_0^\pi = 0$, $X_i^\pi = 0$ for $i < n$, and $X_i^\pi = X_n^\pi \geq 0$
for all $i \geq n$; moreover, $P(X_N^\pi > 0) > 0$, which contradicts condition a’).


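5. Proof e) $\Longrightarrow$ b). We shall define the martingale measure $\widetilde{P} \in \mathcal{P}_b(P)$ by the
formula $d\widetilde{P} = Z\, dP$, where

$$Z = \prod_{n=1}^{N} Z_n \tag{16}$$

for some $\mathcal{F}_n$-measurable functions $Z_n$.
We consider the regular conditional probability

$$Q_n(\omega, dx) = P(\Delta S_n \in dx \mid \mathcal{F}_{n-1})(\omega),$$

which is also called the transitional probability (from $(\Omega, \mathcal{F}_{n-1})$ to $(\mathbb{R}^d, \mathcal{B}(\mathbb{R}^d))$). In
what follows, we construct from $Q_n(\omega, dx)$, by means of a special procedure, transitional
probabilities $Q_n'(\omega, dx, dv)$ (from $(\Omega, \mathcal{F}_{n-1})$ to $(E, \mathcal{E})$ with $E = \mathbb{R}^d \times (0, \infty)$)
such that

$$Q_n'(\omega, dx, (0, \infty)) = Q_n(\omega, dx). \tag{17}$$

We now return to our discussion in subsection 2 and set $Q'(dx, dv)$ there to be
equal to the measure $Q_n'(\omega, dx, dv)$ ($n = 1, \ldots, N$, $\omega \in \Omega$).
We also set (see (5)–(7))

$$A = \Bigl\{(\omega, g) \in \Omega \times G : \int_E z_g(x, v)\, Q_n'(\omega, dx, dv) = 1,\ \varphi(g) = 0\Bigr\}.$$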

Note that the above-introduced space $G$ of elements $g$ defined by (4) is a Polish
space (with product topology). Let $\mathcal{G}$ be its Borel $\sigma$-algebra.

The set $A$ is $\mathcal{F}_{n-1} \otimes \mathcal{G}$-measurable, and its projection $\pi(A)$ is equal to $\Omega$ by
assumption e).
Hence, by Lemma 2 there exists a $G$-valued $\mathcal{F}_{n-1}$-measurable function $g_n = g_n(\omega)$
such that $(\omega, g_n(\omega)) \in A$ for $P$-almost all $\omega \in \Omega$.
We now consider the function

$$z_n(\omega, x, v) = z_{g_n(\omega)}(x, v) \tag{18}$$

in $\Omega \times E$. It is $\mathcal{F}_{n-1} \otimes \mathcal{E}$-measurable, and ($P$-a.s.) we have

$$\int_E z_n(\omega, x, v)\, Q_n'(\omega, dx, dv) = 1 \tag{19}$$

and

$$\int_E x\, z_n(\omega, x, v)\, Q_n'(\omega, dx, dv) = 0. \tag{20}$$

By (5),

$$\sup_{x, v} v\, z_n(\omega, x, v) < \infty$$

$P$-almost surely.
Using Lemma 2 again we see that there exists an $\mathcal{F}_{n-1}$-measurable positive function
$V_{n-1} = V_{n-1}(\omega)$ such that

$$v\, z_n(\omega, x, v) \leq V_{n-1}(\omega) \tag{21}$$

for all $(x, v)$ and $P$-almost all $\omega \in \Omega$.

We proceed now to a construction (by backward induction) of the sequence
of measures $Q_n' = Q_n'(\omega, dx, dv)$ and the corresponding sequence of functions
$z_n(\omega, x, v)$, $n = N, N - 1, \ldots$.

To this end we set $n = N$ and $V_N(\omega) = 1$, $\omega \in \Omega$. Let $Q_N' = Q_N'(\omega, dx, dv)$
be the regular $\mathcal{F}_{N-1}$-measurable conditional probability of the vector $(\Delta S_N, V_N)$.
Clearly, the ‘first marginal’ of this measure is precisely $Q_N = Q_N(\omega, dx)$.

Let $z_N(\omega, x, v)$ be a function associated with $Q_N'$ in accordance with the construction
(18) and let $V_{N-1}(\omega)$ be defined by $z_N(\omega, x, v)$ in accordance with (21).
Then we define $Q_{N-1}'$ as the regular $\mathcal{F}_{N-2}$-measurable conditional distribution of
the vector $(\Delta S_{N-1}, V_{N-1})$. In general, given $Q_n'$, we define $z_n(\omega, x, v)$ in accordance
with (18) and $V_{n-1}(\omega)$ in accordance with (21).

Next we set $Q_{n-1}'$ to be equal to the $\mathcal{F}_{n-2}$-measurable conditional distribution
of $(\Delta S_{n-1}, V_{n-1})$ and so on.


We now set

$$Z_n(\omega) = z_n(\omega, \Delta S_n(\omega), V_n(\omega)) \tag{22}$$

and

$$Z(\omega) = \prod_{n=1}^{N} Z_n(\omega). \tag{23}$$

Then, in view of (21) and (22),

$$Z_n(\omega) \leq \frac{V_{n-1}(\omega)}{V_n(\omega)}. \tag{24}$$

By (23), (24) and bearing in mind that $V_N(\omega) = 1$ we obtain

$$Z(\omega) \leq V_0(\omega), \tag{25}$$

where $V_0(\omega) < \infty$ ($P$-a.s.). Since $\mathcal{F}_0 = \{\varnothing, \Omega\}$, the function $V_0(\omega)$ is constant
($P$-a.s.).
From the definition of $Q_n'$ and conditions (19) and (20) we see that
($P$-a.s.)

$$E(Z_n \mid \mathcal{F}_{n-1})(\omega) = 1 \tag{26}$$

and

$$E(\Delta S_n Z_n \mid \mathcal{F}_{n-1})(\omega) = 0. \tag{27}$$

By (26) we obtain that $EZ = 1$, and therefore we can define a new probability
measure $\widetilde{P}$ by setting

$$\widetilde{P}(d\omega) = Z(\omega)\, P(d\omega). \tag{28}$$

It is clear from (25) that $\widetilde{P} \sim P$.
Since

$$E_{\widetilde{P}}|\Delta S_n| = E Z |\Delta S_n| \leq V_0 E|\Delta S_n| < \infty$$

and

$$E_{\widetilde{P}}(\Delta S_n \mid \mathcal{F}_{n-1}) = E(\Delta S_n Z_n \mid \mathcal{F}_{n-1}) = 0$$

by Bayes’s formula (see formula (4) in § 3a) and (27), the sequence $S = (S_n)$ is a
$\widetilde{P}$-martingale.


By the above construction of the measure $\widetilde{P}$ we obtain that $\widetilde{P} \in \mathcal{P}_b(P)$. This
proves the implication e) $\Longrightarrow$ b).


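6. Proof c) $\Longrightarrow$ a”). Let $\mathcal{P}(P) \neq \varnothing$ and let $\widetilde{P} \in \mathcal{P}(P)$. Then $S = (S_n)_{n \leq N}$ is a $\widetilde{P}$-martingale.
Let $\pi$ be a self-financing strategy such that $X_0^\pi = 0$ and $X_N^\pi \geq 0$.

Since $\Delta X_n^\pi = \gamma_n \Delta S_n$, the sequence $X^\pi = (X_n^\pi)_{n \leq N}$ is a martingale transform,
and therefore $X^\pi$ is a martingale by the lemma in Chapter II, § 1c. Hence
$E_{\widetilde{P}} X_N^\pi = E_{\widetilde{P}} X_0^\pi = 0$, so that $X_N^\pi = 0$ ($P$-a.s.), which proves the required implication.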

The proof of Theorem A* is complete. 

Remark 2. If we do not assume that $B_n = 1$, $n \leq N$, then in place of the regular
conditional probabilities

$$Q_n(\omega, dx) = P(\Delta S_n \in dx \mid \mathcal{F}_{n-1})(\omega),$$

one must consider directly the regular modifications of the conditional probabilities

$$\overline{Q}_n(\omega, dx) = P\Bigl(\Delta\Bigl(\frac{S_n}{B_n}\Bigr) \in dx \Bigm| \mathcal{F}_{n-1}\Bigr)(\omega).$$

The argument, in which one must bear in mind that the $B_n$, $n \leq N$, are positive
and $\mathcal{F}_{n-1}$-measurable, remains essentially the same.


3. Construction of Martingale Measures by Means of 
an Absolutely Continuous Change of Measure 


§ 3a. Main Definitions. Density Process 


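1. Let $(\Omega, \mathcal{F}, (\mathcal{F}_n)_{n \geq 1}, P)$ be a filtered probability space. Then we say that a
measure $\widetilde{P}$ in $(\Omega, \mathcal{F})$ is absolutely continuous with respect to $P$ (we write $\widetilde{P} \ll P$)
if $\widetilde{P}(A) = 0$ for each $A \in \mathcal{F}$ such that $P(A) = 0$.

We say that measures $P$ and $\widetilde{P}$ in the same measurable space $(\Omega, \mathcal{F})$ are equivalent
(we write $\widetilde{P} \sim P$) if $\widetilde{P} \ll P$ and $P \ll \widetilde{P}$.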

In many cases the condition of absolute continuity or equivalence is too restric- 
tive or simply redundant: one can do with a weaker condition of local absolute 
continuity that can be explained as follows. 

Let Pn = P| Fn be the restriction of the probability measure P to the o-algebra 
Fn C F. (In other words, Pp is a measure in (Q, Fn) such that 

Pn(A) = P(A) 
for each A € Fn.) Then we say that a measure P is locally absolutely continuous 
Si 
with respect to P (we write P < P) if 
Pn < Pr 
for each n È 1. 

Two measures, P and P, are said to be locally equivalent (we write P a P) if 
~l loc ~ 
P< PandP <P. 

If, e.g., $\Omega$ is the space $\mathbb R^\infty$, the coordinate space of sequences $\omega = (x_1, x_2, \dots)$, $\mathcal F_n = \sigma(\omega\colon x_1, \dots, x_n)$ is the σ-algebra generated by the first $n$ coordinate functions, $\mathcal F = \mathcal B(\mathbb R^\infty)$, and $P$, $\tilde P$ are probability measures in $(\Omega, \mathcal F)$, then the relation $\tilde P \overset{\mathrm{loc}}{\ll} P$ means precisely that the finite-dimensional probability distributions of $\tilde P$ are absolutely continuous with respect to the corresponding distributions of $P$.

Note that if we have $n \le N < \infty$, then local absolute continuity is the same as absolute continuity. Thus, the ‘local’ concept can be of interest only for models with time parameter $n \in \mathbb N = \{1, 2, \dots\}$.

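Besides the restrictions $P_n = P \,|\, \mathcal F_n$ of the measure $P$ to the σ-algebras $\mathcal F_n$ we shall also consider, for a stopping time $\tau$, the restrictions $P_\tau = P \,|\, \mathcal F_\tau$ of $P$ to the σ-algebras $\mathcal F_\tau$ of sets $A \in \mathcal F$ such that $\{\tau(\omega) \le n\} \cap A \in \mathcal F_n$ for each $n \ge 1$. We emphasize that, as usual, $\mathcal F_\infty = \mathcal F$ (and $\mathcal F_{\infty-} = \bigvee \mathcal F_n$); see [250; Chapter I, § 1a].

2. Let $\tilde P \overset{\mathrm{loc}}{\ll} P$. Then $\tilde P_n \ll P_n$ for each $n \in \mathbb N$ and there exist (see, e.g., [439]) Radon-Nikodym derivatives denoted by

$$\frac{d\tilde P_n}{dP_n}$$

and defined as $\mathcal F_n$-measurable functions $Z_n = Z_n(\omega)$ such that

$$\tilde P_n(A) = \int_A Z_n(\omega)\, P_n(d\omega), \qquad A \in \mathcal F_n. \qquad (1)$$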


Remark. The Radon-Nikodym derivative $\dfrac{d\tilde P_n}{dP_n}$ is defined only up to $P_n$-indistinguishability, i.e., if (1) holds for two functions, $Z_n(\omega)$ and $Z_n'(\omega)$, then

$$P(Z_n(\omega) \ne Z_n'(\omega)) = 0.$$

Both $Z_n(\omega)$ and $Z_n'(\omega)$ are ‘representatives’ of the Radon-Nikodym derivative in this case. Saying that ‘$Z_n = \dfrac{d\tilde P_n}{dP_n}$ is the Radon-Nikodym derivative’ we mean that we have chosen one representative that we shall consider in what follows. It is easy to see that we can choose versions such that not only $P(Z_n(\omega) \ge 0) = 1$, but also $Z_n(\omega) \ge 0$ for all $\omega \in \Omega$ and each $n \ge 1$. It is for this reason that one usually includes the non-negativity in the definition of the Radon-Nikodym derivative of a probability measure.

In what follows, we shall call the discrete-time process

$$Z = (Z_n)_{n\ge1}$$

the density process (of the measures $\tilde P_n$ with respect to the $P_n$, $n \ge 1$, or of the measure $\tilde P$ with respect to the measure $P$).

For the convenience of references we combine in the next theorem several simple 
but useful and important properties of the density process. 
THEOREM. Assume that $\tilde P \overset{\mathrm{loc}}{\ll} P$. Then we have the following results.

a) The density process $Z = (Z_n)$ is a non-negative $(P, (\mathcal F_n))$-martingale with $E Z_n = 1$.

b) Let $\mathcal F = \mathcal F_{\infty-}$, where $\mathcal F_{\infty-} = \bigvee \mathcal F_n$. Then the following conditions are equivalent:

(i) $\tilde P \ll P$;

(ii) $Z = (Z_n)$ is a uniformly integrable $(P, (\mathcal F_n))$-martingale;

(iii) $\tilde P\big(\sup_n Z_n < \infty\big) = 1$.

c) Let $\tau = \inf\{n\colon Z_n = 0\}$ be the first instant when the density process vanishes. Then it ‘remains at the origin indefinitely’, i.e.,

$$P\{\omega\colon Z_n(\omega) \ne 0 \text{ for some } n \ge \tau(\omega)\} = 0.$$

d) Let $\tau$ be some stopping time. Then the restrictions $\tilde P_\tau = \tilde P \,|\, \mathcal F_\tau$ and $P_\tau = P \,|\, \mathcal F_\tau$ to the σ-algebra $\mathcal F_\tau$ (see Definition 2 in Chapter II, § 1f) satisfy the relations $\tilde P_\tau \ll P_\tau$ and

$$\frac{d\tilde P_\tau}{dP_\tau} = Z_\tau. \qquad (2)$$

e) We have the equality

$$\tilde P\big(\inf_n Z_n > 0\big) = 1. \qquad (3)$$

f) If $P(Z_n > 0) = 1$ for each $n \ge 1$, then $P \overset{\mathrm{loc}}{\ll} \tilde P$ and

$$\tilde P \overset{\mathrm{loc}}{\sim} P.$$


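Proof. a) By (1),

$$\tilde P_n(A) = E(I_A Z_n) \quad\text{and}\quad \tilde P_{n+1}(A) = E(I_A Z_{n+1}),$$

where $\tilde P_n(A) = \tilde P_{n+1}(A)$ for $A \in \mathcal F_n$, and therefore $E I_A Z_n = E I_A Z_{n+1}$. Hence $E(Z_{n+1} \mid \mathcal F_n) = Z_n$ (P-a.s.) for each $n \ge 1$. It is also clear that $E Z_n = \tilde P_n(\Omega) = 1$. Thus, $Z$ is a $(P, (\mathcal F_n))$-martingale.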
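
Assertion a) can be checked by hand on a finite model. The following is a minimal sketch under assumptions that are not part of the text: a hypothetical coin-tossing space in which each toss equals 1 with probability 1/2 under $P$ and with probability 1/3 under $\tilde P$, so that $Z_n$ is a product of one-step likelihood ratios. The function names are illustrative only.

```python
from fractions import Fraction
from itertools import product

# Hypothetical toy model (an assumption, not from the text): i.i.d. 0/1 tosses,
# P(toss = 1) = 1/2 under P and 1/3 under the second measure P~.
P_UP = Fraction(1, 2)
PT_UP = Fraction(1, 3)


def prob(path, p_up):
    """Probability of a finite 0/1 path when tosses are i.i.d. with P(1) = p_up."""
    result = Fraction(1)
    for x in path:
        result *= p_up if x == 1 else 1 - p_up
    return result


def density(path):
    """Z_n on the atom of F_n fixed by the first n tosses: dP~_n / dP_n."""
    return prob(path, PT_UP) / prob(path, P_UP)


def expected_density(n):
    """E Z_n under P, by exact enumeration of the atoms of F_n."""
    return sum(density(w) * prob(w, P_UP) for w in product((0, 1), repeat=n))


def conditional_next_density(path):
    """E(Z_{n+1} | F_n) under P, evaluated on the atom `path`."""
    return sum(density(path + (x,)) * prob((x,), P_UP) for x in (0, 1))


def is_martingale(n_max):
    """Check E(Z_{n+1} | F_n) = Z_n on every atom for n = 1, ..., n_max - 1."""
    return all(
        conditional_next_density(w) == density(w)
        for n in range(1, n_max)
        for w in product((0, 1), repeat=n)
    )
```

Exact rational arithmetic is used so that the identities $E Z_n = 1$ and $E(Z_{n+1} \mid \mathcal F_n) = Z_n$ hold as equalities rather than up to rounding.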

b) (i) $\Longrightarrow$ (iii). We recall the following classical result of martingale theory ([109]; see also Chapter III, § 3b.4 in the continuous-time case).

DOOB’S CONVERGENCE THEOREM. Let $X = (X_n)$ be a supermartingale with respect to the measure $P$ and the flow $(\mathcal F_n)$ such that there exists an integrable random variable $Y$ such that $X_n \ge E(Y \mid \mathcal F_n)$ (P-a.s.) for each $n \ge 1$.

Then $X_n$ converges (P-a.s.) to a finite limit $X_\infty$.


(The proof can be found in many handbooks, e.g., [439; Chapter VII, § 4].) 

To prove the implication (i) $\Longrightarrow$ (iii) it suffices to observe that since $Z_n \ge 0$, it follows by Doob’s theorem that there exists with $P$-probability one a finite limit $\lim Z_n$. However, $\tilde P \ll P$, therefore with $\tilde P$-probability one there also exists a finite limit, so that $\tilde P\big(\sup_n Z_n < \infty\big) = 1$.

(iii) $\Longrightarrow$ (ii). The uniform integrability of a family of random variables $(\xi_n)$ means that

$$\lim_{N\to\infty} \sup_n E\big(|\xi_n|\, I(|\xi_n| > N)\big) = 0.$$

In our case (where $\xi_n = Z_n$), by (iii) we obtain

$$E\big(Z_n I(Z_n > N)\big) = \tilde P(Z_n > N) \le \tilde P\big(\sup_n Z_n > N\big) \to 0 \quad\text{as } N \to \infty,$$

which proves (ii).

(ii) $\Longrightarrow$ (i). By Doob’s theorem, $Z_n \to Z_\infty$ (P-a.s.). Hence the uniform integrability of the sequence $(Z_n)$ ensures also the convergence in $L^1(\Omega, \mathcal F, P)$, i.e., $E|Z_n - Z_\infty| \to 0$ as $n \to \infty$.

For sets $A \in \mathcal F_m$ and for $n \ge m$ we have

$$\tilde P(A) = E I_A Z_m = E I_A Z_n.$$

However, $E|Z_n - Z_\infty| \to 0$ as $n \to \infty$, and for each $A \in \mathcal F_m$ we obtain

$$\tilde P(A) = E I_A Z_\infty.$$

Hence, using the standard techniques of monotone classes ([439; Chapter II, § 2]) we conclude that the same equality holds in $\bigcup \mathcal F_n$ and in $\mathcal F = \sigma(\bigcup \mathcal F_n)$ ($= \mathcal F_{\infty-} = \bigvee \mathcal F_n$).

Thus, $\tilde P \ll P$ and, moreover,

$$\frac{d\tilde P}{dP} = Z_\infty,$$

where $Z_\infty = \lim Z_n$.

c) To prove this property we require another classical result of martingale theory ([109]; see Chapter III, § 3b.4 again for the continuous-time case).
DOOB’S OPTIONAL STOPPING THEOREM. Let $X = (X_n)$ be a supermartingale such that

$$X_n \ge E(Y \mid \mathcal F_n), \qquad n \ge 1,$$

for some integrable random variable $Y$.

Then the variables $X_\sigma$ and $X_\tau$ are integrable for any two Markov times $\sigma$ and $\tau$, and

$$E(X_\tau \mid \mathcal F_\sigma) \le X_\sigma \quad (P\text{-a.s.})$$

in the set $\{\sigma \le \tau\}$.

(See the proof, e.g., in [109] or [439; Chapter VII, § 2].)

Remark. On $\{\omega\colon \tau(\omega) = \infty\}$ we set the value of $X_\tau(\omega)$ to be equal to $X_\infty(\omega)$, the limit of the $X_n(\omega)$, which exists by Doob’s convergence theorem.

Besides $\tau = \inf\{n \ge 1\colon Z_n = 0\}$ we shall consider the times $\sigma_m = \inf\{n \ge \tau\colon Z_n \ge 1/m\}$. It is easy to see that $\tau$ and $\sigma_m$ are stopping times, i.e., the sets $\{\tau \le n\}$ and $\{\sigma_m \le n\}$ belong to $\mathcal F_n$ for each $n \ge 1$ and all $m \ge 1$. (We recall that, as always, $\tau(\omega) = \infty$ if $Z_n(\omega) > 0$ for all $n \ge 1$.)

By Doob’s optional stopping theorem,

$$E(Z_{\sigma_m} \mid \mathcal F_\tau) \le Z_\tau = 0 \quad\text{on the set } \{\omega\colon \tau(\omega) < \infty\}.$$

Hence $Z_{\sigma_m} I(\tau < \infty) = 0$ (P-a.s.), $m \ge 1$, so that $\sigma_m = \infty$ (P-a.s.), $m \ge 1$, which precisely means that

$$P\{\omega\colon \exists\, n \ge \tau(\omega) \text{ with } Z_n(\omega) \ne 0\} = 0.$$


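d) Let $A \in \mathcal F_\tau$. Then

$$E[I_A \cdot I_{\{\tau<\infty\}} \cdot Z_\tau] = \sum_{n\ge1} E[I_{A\cap\{\tau=n\}} Z_\tau] = \sum_{n\ge1} E[I_{A\cap\{\tau=n\}} Z_n] = \sum_{n\ge1} \tilde P(A \cap \{\tau = n\}) = \tilde P(A \cap \{\tau < \infty\}),$$

which proves the required result.

e) We set

$$\tau_m = \inf\Big\{n\colon Z_n < \frac1m\Big\}.$$

Then

$$\tilde P(\tau_m < \infty) = E\big(Z_{\tau_m} I(\tau_m < \infty)\big) \le \frac1m$$

by d), and therefore

$$\tilde P\Big(\bigcap_m \{\tau_m < \infty\}\Big) = 0,$$

which is equivalent to the required equality $\tilde P\big(\inf_n Z_n > 0\big) = 1$.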
f) If $\tilde P \overset{\mathrm{loc}}{\ll} P$ and $P(Z_n > 0) = 1$, $n \ge 1$, then also $\tilde P(Z_n > 0) = 1$. For $A \in \mathcal F_n$ we set

$$Q_n(A) = \int_A Z_n^{-1}\, \tilde P(d\omega).$$

Then $\tilde P_n(d\omega) = Z_n P_n(d\omega)$, and therefore

$$Q_n(A) = \int_A Z_n^{-1} Z_n\, P(d\omega) = P_n(A).$$

Hence

$$P_n(A) = \int_A Z_n^{-1}\, \tilde P_n(d\omega),$$

so that $P \overset{\mathrm{loc}}{\ll} \tilde P$.

Thus, we have proved all assertions a)-f) of the theorem.
3. The following ‘technical’ result is useful in the considerations of conditional expectations with respect to different measures. We shall use it repeatedly in what follows and call it the ‘conversion lemma’. Formula (4) is often called ‘(generalized) Bayes’ formula’ ([303; Chapter 7]).

LEMMA. Let $\tilde P_n \ll P_n$ and let $Y$ be a bounded (or $\tilde P$-integrable) $\mathcal F_n$-measurable random variable. Then for each $m \le n$ we have

$$\tilde E(Y \mid \mathcal F_m) = \frac{1}{Z_m} E(Y Z_n \mid \mathcal F_m) \quad (\tilde P\text{-a.s.}). \qquad (4)$$


Proof. For a start we observe that $\tilde P(Z_m > 0) = 1$ (see assertion e) in the above theorem). Further, we also have $Z_n(\omega) = 0$ (P-a.s.) in the set $\{\omega\colon Z_m(\omega) = 0\}$ for $n \ge m$. Taking this into account we shall assume that the right-hand side of (4) vanishes in this set.

By definition, $\tilde E(Y \mid \mathcal F_m)$ is an $\mathcal F_m$-measurable random variable such that

$$\tilde E[I_A \cdot \tilde E(Y \mid \mathcal F_m)] = \tilde E[I_A \cdot Y] \qquad (5)$$

for each $A \in \mathcal F_m$, so that we need only verify that for the $\mathcal F_m$-measurable function on the right-hand side of (4) we have

$$\tilde E\Big[I_A \cdot \frac{1}{Z_m} E(Y Z_n \mid \mathcal F_m)\Big] = \tilde E[I_A \cdot Y]. \qquad (6)$$

In fact, this follows from the following chain of equalities:

$$\tilde E\Big[I_A \cdot \frac{1}{Z_m} E(Y Z_n \mid \mathcal F_m)\Big] = E\Big[I_A \cdot \frac{1}{Z_m} E(Y Z_n \mid \mathcal F_m) \cdot Z_m\Big] = E[I_A \cdot E(Y Z_n \mid \mathcal F_m)] \overset{(\alpha)}{=} E[I_A \cdot Y Z_n] \overset{(\beta)}{=} \tilde E[I_A \cdot Y],$$

where ($\alpha$) holds by the definition of the conditional expectation $E(Y Z_n \mid \mathcal F_m)$ and ($\beta$) is a consequence of the $\mathcal F_n$-measurability of $I_A Y$ and the fact that $Z = (Z_n)$ is a martingale.
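
The conversion formula (4) can be verified exactly on a finite space. The sketch below assumes a hypothetical coin-tossing model (each toss equals 1 with probability 1/2 under $P$ and 1/3 under $\tilde P$) and an arbitrary illustrative payoff $Y$; neither the model nor the function names come from the text.

```python
from fractions import Fraction
from itertools import product

# Hypothetical toy parameters (assumptions for illustration only).
P_UP = Fraction(1, 2)    # P(toss = 1)
PT_UP = Fraction(1, 3)   # P~(toss = 1)


def prob(path, p_up):
    """Probability of a finite 0/1 path under i.i.d. tosses with P(1) = p_up."""
    result = Fraction(1)
    for x in path:
        result *= p_up if x == 1 else 1 - p_up
    return result


def density(path):
    """Z_k on the atom determined by `path` (k = len(path))."""
    return prob(path, PT_UP) / prob(path, P_UP)


def payoff(path):
    """An arbitrary F_n-measurable variable Y: squared number of ones."""
    return Fraction(sum(path) ** 2)


def lhs(prefix, n):
    """E~(Y | F_m) on the atom `prefix`, computed directly under P~."""
    tails = product((0, 1), repeat=n - len(prefix))
    return sum(payoff(prefix + t) * prob(t, PT_UP) for t in tails)


def rhs(prefix, n):
    """(1 / Z_m) E(Y Z_n | F_m) on the atom `prefix`, computed under P."""
    tails = product((0, 1), repeat=n - len(prefix))
    cond = sum(payoff(prefix + t) * density(prefix + t) * prob(t, P_UP) for t in tails)
    return cond / density(prefix)
```

In this model $Z_m > 0$ on every atom, so the convention about the set $\{Z_m = 0\}$ is not needed; both sides agree atom by atom.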


§3b. Discrete Version of Girsanov’s Theorem. 
Conditionally Gaussian Case 


1. It is reasonable to start our discussion of the construction of the probability measures $\tilde P$ that are (locally) absolutely continuous or equivalent to the original basic measure $P$ involved in the definition of the filtered space

$$(\Omega, \mathcal F, (\mathcal F_n)_{n\ge1}, P)$$

with a discrete (with respect to time) version of a result established by I. V. Girsanov [183] for processes of diffusion type, which became the prototype for a number of results for martingales, local martingales, local measures, semimartingales, and so on. (See, e.g., [250; Chapter II].)

Let $n \ge 1$ be the time parameter and let $\varepsilon = (\varepsilon_1, \varepsilon_2, \dots)$ be a sequence of $\mathcal F_n$-measurable random variables $\varepsilon_n$ with distribution

$$\mathrm{Law}(\varepsilon_n \mid \mathcal F_{n-1}; P) = \mathcal N(0, 1). \qquad (1)$$

In particular, this means that $\varepsilon$ is a sequence of independent standard normally distributed random variables, $\varepsilon_n \sim \mathcal N(0, 1)$.

Assume that, besides $\varepsilon = (\varepsilon_n)_{n\ge1}$, we are given predictable sequences $\mu = (\mu_n)_{n\ge1}$ and $\sigma = (\sigma_n)_{n\ge1}$, i.e., sequences of $\mathcal F_{n-1}$-measurable variables $\mu_n$ and $\sigma_n$ ($\mathcal F_0 = \{\varnothing, \Omega\}$). Moreover, we assume that $\sigma_n > 0$, which is based on the interpretation of this parameter as ‘volatility’ and the fact that we can simply leave observations with $\sigma_n = 0$ out of consideration.

We set $h = (h_n)_{n\ge1}$, where

$$h_n = \mu_n + \sigma_n \varepsilon_n. \qquad (2)$$

By (1) we obtain that the (regular) conditional distribution $P(h_n \le \cdot \mid \mathcal F_{n-1})$ can be defined as follows:

$$P(h_n \le x \mid \mathcal F_{n-1}) = \frac{1}{\sqrt{2\pi\sigma_n^2}} \int_{-\infty}^{x} e^{-\frac{(y-\mu_n)^2}{2\sigma_n^2}}\, dy, \qquad (3)$$

or, symbolically,

$$\mathrm{Law}(h_n \mid \mathcal F_{n-1}; P) = \mathcal N(\mu_n, \sigma_n^2), \qquad (4)$$

which allows one to call $h = (h_n)$ a conditionally Gaussian sequence (with respect to $P$) with (conditional) expectation

$$E(h_n \mid \mathcal F_{n-1}) = \mu_n \qquad (5)$$

and variance

$$D(h_n \mid \mathcal F_{n-1}) = \sigma_n^2. \qquad (6)$$

We see from (4) that

$$E(h_n - \mu_n \mid \mathcal F_{n-1}) = 0, \qquad (5')$$

$$D(h_n - \mu_n \mid \mathcal F_{n-1}) = \sigma_n^2. \qquad (6')$$


Setting

$$H_n = \sum_{k=1}^n h_k, \qquad A_n = \sum_{k=1}^n \mu_k, \qquad\text{and}\qquad M_n = \sum_{k=1}^n \sigma_k \varepsilon_k,$$

we can say that the variables $H_n$ can be represented as follows in the conditionally Gaussian case:

$$H_n = A_n + M_n,$$

where $A = (A_n)$ is a predictable sequence and $M = (M_n)$ is a conditionally Gaussian martingale with quadratic characteristic

$$\langle M \rangle_n = \sum_{k=1}^n \sigma_k^2.$$

We set $W_n = \sum_{k=1}^n \varepsilon_k$, $\Delta W_n = \varepsilon_n$, $\Delta H_n = h_n$, and $\Delta = 1$. Then we can rewrite (2) in the difference form as follows:

$$\Delta H_n = \mu_n \Delta + \sigma_n \Delta W_n,$$

which one can regard as a discrete counterpart to the stochastic differential

$$dH_t = \mu_t\, dt + \sigma_t\, dW_t$$

of some Itô process $H = (H_t)$ generated by a Wiener process $W = (W_t)_{t\ge0}$, with local drift $(\mu_t)_{t\ge0}$ and local volatility $(\sigma_t)_{t\ge0}$ (see, e.g., [303; Chapter 4] and Chapter III, § 3d).

In the present case of a conditionally Gaussian sequence (2) this discrete analog of Girsanov’s theorem (which, as already mentioned, was proved by I. V. Girsanov in the continuous-time case) relates to the question of the existence of a measure $\tilde P$ such that $\tilde P$ is absolutely continuous or equivalent to the measure $P$ and the sequence $h = (h_n)$ is a (local) martingale difference with respect to $\tilde P$. It is worthwhile to point out in this connection that the right-hand side of (2) contains two terms: the ‘drift’ $\mu_n$ and the ‘discrete diffusion’ $\sigma_n \varepsilon_n$, which is a martingale difference (with respect to $P$). The meaning of the above question lies in the existence of a measure $\tilde P \ll P$ such that $(h_n)$ has no ‘drift’ component with respect to $\tilde P$ and reduces to ‘discrete diffusion’, i.e., is a (local) martingale difference.
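2. Our construction of the measure $\tilde P$ is based on the sequence of (positive) random variables

$$Z_n = \exp\Big\{-\sum_{k=1}^n \frac{\mu_k}{\sigma_k}\varepsilon_k - \frac12 \sum_{k=1}^n \Big(\frac{\mu_k}{\sigma_k}\Big)^2\Big\}, \qquad n \ge 1. \qquad (7)$$

LEMMA. 1) The sequence $Z = (Z_n)_{n\ge1}$ is a $(P, (\mathcal F_n))$-martingale with $E Z_n = 1$, $n \ge 1$.

2) Let $\mathcal F = \bigvee \mathcal F_n$ and assume that

$$E\exp\Big\{\frac12 \sum_{k=1}^\infty \Big(\frac{\mu_k}{\sigma_k}\Big)^2\Big\} < \infty \qquad (8)$$

(the ‘Novikov condition’). Then $Z = (Z_n)_{n\ge1}$ is a uniformly integrable martingale with limit (P-a.s.) $Z_\infty = \lim Z_n$, such that

$$Z_\infty = \exp\Big\{-\sum_{k=1}^\infty \frac{\mu_k}{\sigma_k}\varepsilon_k - \frac12 \sum_{k=1}^\infty \Big(\frac{\mu_k}{\sigma_k}\Big)^2\Big\} \qquad (9)$$

and

$$Z_n = E(Z_\infty \mid \mathcal F_n). \qquad (10)$$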


Proof. 1) This is obvious since for each $k \ge 1$ we obtain

$$E\Big(\exp\Big\{-\frac{\mu_k}{\sigma_k}\varepsilon_k - \frac12\Big(\frac{\mu_k}{\sigma_k}\Big)^2\Big\} \,\Big|\, \mathcal F_{k-1}\Big) = 1 \qquad (11)$$

by the $\mathcal F_{k-1}$-measurability of the $\dfrac{\mu_k}{\sigma_k}$ and the conditionally Gaussian property (1) (here $\mathcal F_0 = \{\varnothing, \Omega\}$).

2) The proof that the family $(Z_n)$ is uniformly integrable if (8) holds is fairly complicated and can be found in the original paper [368] by A. A. Novikov as well as in many textbooks (see, e.g., [303; Chapter 7] or [402]).

However, this proof becomes relatively elementary once one imposes a stronger condition: there exists $\varepsilon > 0$ such that

$$E\exp\Big\{\Big(\frac12 + \varepsilon\Big) \sum_{k=1}^\infty \Big(\frac{\mu_k}{\sigma_k}\Big)^2\Big\} < \infty. \qquad (12)$$

Hence it would be reasonable to present this proof, which is what we do at the end of this section (see subsection 5).
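3. We fix some $N \ge 1$ and shall consider the sequence $(h_n)$ only for $n \le N$. For simplicity we shall assume that $\mathcal F = \mathcal F_N$, so that $P_N = P \,|\, \mathcal F_N = P$.

Since $Z_N > 0$ and $E Z_N = 1$, we can define a probability measure $\tilde P = \tilde P(d\omega)$ in $(\Omega, \mathcal F)$ by setting

$$\tilde P(d\omega) = Z_N(\omega)\, P(d\omega). \qquad (13)$$

We emphasize that we do not only have the relation $\tilde P \ll P$, but also $P \ll \tilde P$. Hence $\tilde P \sim P$.

We consider now the properties of the sequence $(h_n)_{n\le N}$ with respect to $\tilde P$.

By Bayes’s formula (4) in § 3a, for each $\lambda \in \mathbb R$ and $n \le N$ we obtain

$$\begin{aligned}
\tilde E(e^{i\lambda h_n} \mid \mathcal F_{n-1})
&= E\Big(\exp\Big\{\Big(i\lambda\sigma_n - \frac{\mu_n}{\sigma_n}\Big)\varepsilon_n + i\lambda\mu_n - \frac12\Big(\frac{\mu_n}{\sigma_n}\Big)^2\Big\} \,\Big|\, \mathcal F_{n-1}\Big) \\
&= E\Big(\exp\Big\{\Big(i\lambda\sigma_n - \frac{\mu_n}{\sigma_n}\Big)\varepsilon_n - \frac12\Big(i\lambda\sigma_n - \frac{\mu_n}{\sigma_n}\Big)^2\Big\} \\
&\qquad\quad \times \exp\Big\{\frac12\Big(i\lambda\sigma_n - \frac{\mu_n}{\sigma_n}\Big)^2 + i\lambda\mu_n - \frac12\Big(\frac{\mu_n}{\sigma_n}\Big)^2\Big\} \,\Big|\, \mathcal F_{n-1}\Big) \\
&= e^{-\frac{\lambda^2\sigma_n^2}{2}}
\end{aligned} \qquad (14)$$

(P-a.s.), where we use the equality

$$E\Big(\exp\Big\{\Big(i\lambda\sigma_n - \frac{\mu_n}{\sigma_n}\Big)\varepsilon_n - \frac12\Big(i\lambda\sigma_n - \frac{\mu_n}{\sigma_n}\Big)^2\Big\} \,\Big|\, \mathcal F_{n-1}\Big) = 1$$

and the fact that the $\sigma_n^2$ are $\mathcal F_{n-1}$-measurable.

We obtain the equality

$$\tilde E(e^{i\lambda h_n} \mid \mathcal F_{n-1}) = e^{-\frac{\lambda^2\sigma_n^2}{2}} \quad (\tilde P\text{-a.s.}), \qquad (15)$$

which means that the sequence $h = (h_n)$ remains conditionally Gaussian with respect to this new measure $\tilde P$, but has now the trivial ‘drift’ component:

$$\mathrm{Law}(h_n \mid \mathcal F_{n-1}; \tilde P) = \mathcal N(0, \sigma_n^2), \qquad (16)$$

so that we obtain the following analogs of (5) and (6) in this case:

$$\tilde E(h_n \mid \mathcal F_{n-1}) = 0, \qquad (17)$$

$$\tilde D(h_n \mid \mathcal F_{n-1}) = \sigma_n^2. \qquad (18)$$

One can say that the transition from $P$ to the measure $\tilde P$ eliminates (‘kills’) the drift $\mu = (\mu_n)_{n\le N}$ of the sequence $h = (h_n)_{n\le N}$, but preserves the conditional variance.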
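
The effect described by (17) and (18) can be observed numerically for a single step with constant $\mu_n$ and $\sigma_n$. The sketch below is an illustration under assumptions of our own: it integrates against the standard normal density with a plain trapezoidal rule, and the function names and the sample values of $\mu$ and $\sigma$ are hypothetical.

```python
import math


def normal_pdf(x):
    """Density of the standard normal law N(0, 1)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def integrate(f, lo=-12.0, hi=12.0, steps=24000):
    """Composite trapezoidal rule on [lo, hi]; ample for Gaussian integrands."""
    width = (hi - lo) / steps
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, steps):
        total += f(lo + i * width)
    return total * width


def moments_under_new_measure(mu, sigma):
    """Total mass, mean and variance of h = mu + sigma * eps under the measure
    with one-step density exp(-(mu/sigma) eps - (mu/sigma)^2 / 2)."""
    b = mu / sigma

    def weight(e):
        # one-step Girsanov density times the N(0, 1) density of eps
        return math.exp(-b * e - 0.5 * b * b) * normal_pdf(e)

    mass = integrate(weight)
    mean = integrate(lambda e: (mu + sigma * e) * weight(e))
    second = integrate(lambda e: (mu + sigma * e) ** 2 * weight(e))
    return mass, mean, second - mean ** 2
```

For instance, `moments_under_new_measure(0.7, 1.3)` should return a mass close to 1, a mean close to 0 and a variance close to $1.3^2$, up to quadrature and rounding error.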
We can also conclude from (16) that if $\tilde\varepsilon = (\tilde\varepsilon_n)_{n\le N}$ is a sequence of $\mathcal F_n$-measurable variables with distribution

$$\mathrm{Law}(\tilde\varepsilon_n \mid \mathcal F_{n-1}; \tilde P) = \mathcal N(0, 1) \qquad (19)$$

(one can always construct such a sequence, although it may be necessary to enlarge our initial probability space), then

$$\mathrm{Law}(h_n,\ n \le N \mid \tilde P) = \mathrm{Law}(\sigma_n\tilde\varepsilon_n,\ n \le N \mid \tilde P). \qquad (20)$$

Hence it is clear that the sequence $(h_n)_{n\le N}$ ‘behaves’ as a local martingale difference $(\sigma_n\tilde\varepsilon_n)_{n\le N}$ with respect to $\tilde P$, while in terms of the original measure $P$ a property similar to (20) can be expressed as follows:

$$\mathrm{Law}(h_n - \mu_n,\ n \le N \mid P) = \mathrm{Law}(\sigma_n\varepsilon_n,\ n \le N \mid P). \qquad (21)$$

We now shift our standpoint.

We shall treat a measure $\tilde P$ and a sequence $h = (h_n)$ satisfying (20) as original ones. Then the transition from $\tilde P$ to a measure $P$ (in accordance with (13)) gives us property (21), which can be interpreted as drift appearing in the local martingale difference $(\sigma_n\tilde\varepsilon_n)_{n\le N}$. This interpretation is more convenient when one is willing to state the corresponding general result on the transformations of local martingales under absolutely continuous changes of measure (see § 3d below).

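Before summarizing the results just obtained we point out the following.

Assume that $\sigma_n^2(\omega)$ is independent of $\omega$ ($= \sigma_n^2$). Using induction we obtain by (14) that

$$\tilde E \exp\Big\{i\sum_{k=1}^N \lambda_k h_k\Big\} = \tilde E\Big(\exp\Big\{i\sum_{k=1}^{N-1} \lambda_k h_k\Big\}\, \tilde E(e^{i\lambda_N h_N} \mid \mathcal F_{N-1})\Big) = e^{-\frac{\lambda_N^2\sigma_N^2}{2}}\, \tilde E \exp\Big\{i\sum_{k=1}^{N-1} \lambda_k h_k\Big\} = \dots = \exp\Big\{-\frac12\sum_{k=1}^N \lambda_k^2\sigma_k^2\Big\}.$$

Thus, with respect to the measure $\tilde P$ the sequence $(h_n)_{n\le N}$ consists of independent normally distributed random variables $h_n \sim \mathcal N(0, \sigma_n^2)$, with expectations zero. (We could arrive at the same conclusion on the basis of (17)-(20).)

We now sum up the above results in the following theorem, which we could call the discrete analog of Girsanov’s theorem.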


THEOREM. Let $h = (h_n)_{n\le N}$ be a conditionally Gaussian sequence such that

$$\mathrm{Law}(h_n \mid \mathcal F_{n-1}; P) = \mathcal N(\mu_n, \sigma_n^2), \qquad n \le N.$$

Let $\mathcal F_N = \mathcal F$ and let $\tilde P$ be the measure defined by formula (13) with density $Z_N$ defined in (7).

Then

1) the sequence $h = (h_n)_{n\le N}$ is conditionally Gaussian with respect to $\tilde P$:

$$\mathrm{Law}(h_n \mid \mathcal F_{n-1}; \tilde P) = \mathcal N(0, \sigma_n^2), \qquad n \le N;$$

2) if the $\sigma_n^2 = \sigma_n^2(\omega)$, $n \le N$, are independent of $\omega$, then $h = (h_n)_{n\le N}$ is a sequence of independent Gaussian variables with respect to $\tilde P$:

$$\mathrm{Law}(h_n \mid \tilde P) = \mathcal N(0, \sigma_n^2), \qquad n \le N;$$

3) if $\mathcal F = \bigvee \mathcal F_n$ and condition (8) is satisfied, then properties 1) and 2) hold with respect to the measure $\tilde P(d\omega) = Z_\infty(\omega) P(d\omega)$ for all $n \ge 1$, where $Z_\infty(\omega)$ is as in (9).


4. We note that the sequences $(\mu_n)$ and $(\sigma_n)$ from the definition of $h_n$ ($= \mu_n + \sigma_n\varepsilon_n$) are directly involved in our construction of the measure $\tilde P$ in accordance with (13) and (7). For this reason and also to establish connections with our special choice of the values $a_n(\omega)$ in the discussion of the Esscher transformations (see § 2d), we consider now the sequence of processes $Z^{(b)} = (Z_n^{(b)})_{n\le N}$ defined by the equalities

$$Z_n^{(b)}(\omega) = \exp\Big\{-\sum_{k=1}^n b_k\varepsilon_k - \frac12\sum_{k=1}^n b_k^2\Big\}, \qquad (22)$$

where the $b_k = b_k(\omega)$ are $\mathcal F_{k-1}$-measurable.

Since $E Z_N^{(b)}(\omega) = 1$, we have a well-defined probability measure

$$\tilde P^{(b)}(d\omega) = Z_N^{(b)}(\omega)\, P(d\omega) \qquad (23)$$

in $\mathcal F_N = \mathcal F$.

The expectation with respect to this measure is

$$\tilde E^{(b)}(h_n \mid \mathcal F_{n-1}) = \mu_n - \sigma_n b_n. \qquad (24)$$

It is now clear why one sets $b_n = \dfrac{\mu_n}{\sigma_n}$, $n \le N$, in Girsanov’s theorem: it is under this choice that the sequence $h = (h_n)_{n\le N}$ becomes a local martingale difference.

Further, if $X_n = \mu_n + \varepsilon_n$, then

$$\varphi_n(a; \omega) = E(e^{aX_n} \mid \mathcal F_{n-1}) = e^{a\mu_n + \frac12 a^2}.$$

Hence

$$\inf_a \varphi_n(a; \omega) = \varphi_n(a_n(\omega); \omega),$$

where $a_n(\omega) = -\mu_n$, $n \le N$.

We used just these ‘extremal’ variables $a_n(\omega)$ in § 2d, in our construction (by means of the Esscher transformation) of a measure making the sequence $(X_n)$ a martingale difference.

Thus, in the present case (of $\sigma_n \equiv 1$) both Girsanov and Esscher transformations bring us to the same measure $\tilde P$.
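
The coincidence of the two transformations for unit volatility can be seen from the one-step densities. The following sketch assumes constant $\mu_n = \mu$ and uses the closed form $\log E e^{aX} = a\mu + a^2/2$ for $X = \mu + \varepsilon$; the function names and the brute-force grid search are illustrative assumptions, not part of the text.

```python
import math


def esscher_log_mgf(a, mu):
    """log E exp(a X) for X = mu + eps with eps ~ N(0, 1)."""
    return a * mu + 0.5 * a * a


def grid_minimiser(mu, lo=-5.0, hi=5.0, steps=10000):
    """Brute-force minimiser of a -> log E exp(a X) on a grid (illustration only)."""
    candidates = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    return min(candidates, key=lambda a: esscher_log_mgf(a, mu))


def esscher_density(eps, mu):
    """One-step Esscher density exp(a X) / E exp(a X) with the extremal a = -mu."""
    a = -mu
    x = mu + eps
    return math.exp(a * x - esscher_log_mgf(a, mu))


def girsanov_density(eps, mu, sigma=1.0):
    """One-step factor of (7): exp(-(mu/sigma) eps - (mu/sigma)^2 / 2)."""
    b = mu / sigma
    return math.exp(-b * eps - 0.5 * b * b)
```

Algebraically, $a(\mu + \varepsilon) - (a\mu + a^2/2)$ with $a = -\mu$ equals $-\mu\varepsilon - \mu^2/2$, which is the exponent in (7) for $\sigma_n = 1$; the code merely confirms this on sample values.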
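5. We now claim that if (12) holds, then the family $Z = (Z_n)_{n\ge1}$ of the variables $Z_n$ defined in (7) is uniformly integrable.

Let $\beta_k = -\dfrac{\mu_k}{\sigma_k}$. Then the condition (12) takes the following form: there exists $\delta > 0$ such that

$$E\exp\Big\{\Big(\frac12 + \delta\Big)\sum_{k=1}^\infty \beta_k^2\Big\} < \infty. \qquad (25)$$

In accordance with (7),

$$Z_n = \exp\Big\{\sum_{k=1}^n \beta_k\varepsilon_k - \frac12\sum_{k=1}^n \beta_k^2\Big\}, \qquad n \ge 1. \qquad (26)$$

Assume that $\varepsilon > 0$ and $p > 1$. We set

$$\psi_n^{(1)} = \exp\Big\{(1+\varepsilon)\sum_{k=1}^n \beta_k\varepsilon_k - \frac{p(1+\varepsilon)^2}{2}\sum_{k=1}^n \beta_k^2\Big\} \qquad (27)$$

and

$$\psi_n^{(2)} = \exp\Big\{\Big(\frac{p(1+\varepsilon)^2}{2} - \frac{1+\varepsilon}{2}\Big)\sum_{k=1}^n \beta_k^2\Big\}. \qquad (28)$$

To prove the uniform integrability of $(Z_n)_{n\ge1}$ it suffices to show that

$$\sup_n E Z_n^{1+\varepsilon} < \infty \qquad (29)$$

for some $\varepsilon > 0$ (see, e.g., [439; Chapter II, § 6, Lemma 3]). Since

$$Z_n^{1+\varepsilon} = \psi_n^{(1)}\psi_n^{(2)},$$

it follows by Hölder’s inequality that

$$E Z_n^{1+\varepsilon} = E\,\psi_n^{(1)}\psi_n^{(2)} \le \big[E(\psi_n^{(1)})^p\big]^{1/p}\big[E(\psi_n^{(2)})^q\big]^{1/q} = \big[E(\psi_n^{(2)})^q\big]^{1/q} \qquad (30)$$

($1/p + 1/q = 1$), where we use the equality

$$E(\psi_n^{(1)})^p = 1$$

(see (11)). We set $p = 1 + \delta$ and $q = (1+\delta)/\delta$, and we choose $\delta > 0$ such that (25) holds. We now find $\varepsilon > 0$ such that

$$\varepsilon(1+\varepsilon) \le \frac{\delta^2}{(1+\delta)(1+2\delta)}. \qquad (31)$$

Then

$$(\psi_n^{(2)})^q = \exp\Big\{\frac{q(1+\varepsilon)}{2}\big(p(1+\varepsilon) - 1\big)\sum_{k=1}^n \beta_k^2\Big\} \le \exp\Big\{\Big(\frac12 + \delta\Big)\sum_{k=1}^n \beta_k^2\Big\} \le \exp\Big\{\Big(\frac12 + \delta\Big)\sum_{k=1}^\infty \beta_k^2\Big\},$$

so that

$$\sup_n E Z_n^{1+\varepsilon} \le \Big[\sup_n E(\psi_n^{(2)})^q\Big]^{1/q} \le \Big[E\exp\Big\{\Big(\frac12 + \delta\Big)\sum_{k=1}^\infty \beta_k^2\Big\}\Big]^{1/q} < \infty$$

by (25), which proves the required uniform integrability of $(Z_n)$.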
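
The only computational step in this argument is that the choice (31) of $\varepsilon$ keeps the coefficient of $\sum \beta_k^2$ in $(\psi_n^{(2)})^q$ below $\frac12 + \delta$. The sketch below checks this arithmetic for a few values of $\delta$, taking the largest $\varepsilon$ admitted by (31); the function names are hypothetical and the code is an arithmetic check, not a proof.

```python
import math


def exponent_coefficient(eps, delta):
    """Coefficient of sum(beta_k^2) in (psi^(2))^q for p = 1 + delta, q = (1 + delta) / delta."""
    p = 1.0 + delta
    q = (1.0 + delta) / delta
    return q * (p * (1.0 + eps) ** 2 / 2.0 - (1.0 + eps) / 2.0)


def largest_eps(delta):
    """Positive root of eps * (1 + eps) = delta^2 / ((1 + delta)(1 + 2 delta)), cf. (31)."""
    bound = delta ** 2 / ((1.0 + delta) * (1.0 + 2.0 * delta))
    return (-1.0 + math.sqrt(1.0 + 4.0 * bound)) / 2.0


def condition_holds(delta):
    """True if the coefficient does not exceed 1/2 + delta for the largest admissible eps."""
    return exponent_coefficient(largest_eps(delta), delta) <= 0.5 + delta
```

For example, with $\delta = 1$ one gets $\varepsilon \approx 0.1455$ and a coefficient of about $1.479$, which is below $\frac12 + \delta = 1.5$.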


§3c. Martingale Property of the Prices 
in the Case of a Conditionally Gaussian 
and Logarithmically Conditionally Gaussian Distributions 


1. Let $(\Omega, \mathcal F, (\mathcal F_n), P)$, $n \ge 0$, be the original filtered probability space. In our discussions of martingale measures turning the sequence of discounted prices $\dfrac{S}{B} = \Big(\dfrac{S_n}{B_n}\Big)$, where $B = (B_n)$ and $S = (S_n)$, into a martingale we consider first a somewhat idealized model of a $(B, S)$-market: we shall assume that $B_n \equiv 1$,

$$S_n = S_0 + H_n, \qquad n \ge 1, \qquad (1)$$

where $H_n = \sum_{k=1}^n h_k$ and $S_0 = \mathrm{Const}$. We shall also assume that $h = (h_n)$ is a conditionally Gaussian sequence: $h_n = \mu_n + \sigma_n\varepsilon_n$, where the $\mu_n$ and $\sigma_n$ are $\mathcal F_{n-1}$-measurable, and $\varepsilon = (\varepsilon_n)$ is a sequence of independent, $\mathcal N(0, 1)$-distributed, $\mathcal F_n$-measurable variables $\varepsilon_n$, $n \ge 1$ (see the preceding section for details).

Thus, we allow the prices $S_n$ to take also negative values. (This is the ‘idealization’ we have mentioned.) Note, however, that L. Bachelier [12] considered precisely a model of this kind; see Chapter I, § 2a.

Let $\mu_n \equiv 0$. Then

$$S_n = S_0 + \sum_{k\le n} \sigma_k\varepsilon_k. \qquad (2)$$

Since the $\sigma_k$ are $\mathcal F_{k-1}$-measurable and $E(\varepsilon_k \mid \mathcal F_{k-1}) = 0$, it follows by (2) that the sequence of prices $S = (S_n)$ is a martingale transformation and, therefore, a local martingale (see the theorem in Chapter II, § 1c). Assuming additionally that, e.g., $E|\sigma_k\varepsilon_k| < \infty$, $k \ge 1$, we obtain that $S = (S_n)$ is a martingale with respect to the original measure $P$. (As regards general conditions ensuring that a local martingale is a martingale, see Chapter II, § 1c.)
Assume now that the $\mu_n$ do not vanish identically for all $n \le N$. In this case the discrete version of Girsanov’s theorem (§ 3b) gives us tools for a construction of measures $\tilde P_N$ (see formulas (8) and (6) in § 3b) such that the sequences $(S_n)_{n\le N}$ are local martingales with respect to $\tilde P_N$ (or even are martingales if $\tilde E|\sigma_n\varepsilon_n| < \infty$ for all $n \le N$).


2. We consider now a more down-to-earth situation where

$$S_n = S_0 e^{H_n} \qquad (3)$$

in our $(B, S)$-market and $B_n \equiv 1$, $n \le N$.

As mentioned in Chapter II, § 1a, the representation (3) of ‘compound interest’ kind is convenient for a statistical analysis but it is not perfectly convenient for the aims of stochastic analysis. The problem is that to verify the martingale property of the sequence $S = (S_n)$ one would like to have a result saying: “in order that a sequence $S = (S_n)$ defined by (3) be a martingale it is sufficient that the sequence $H = (H_n)$ be a martingale”. This is, however, not so in general, which explains why we turn to the representation

$$S_n = S_0\, \mathcal E(\hat H)_n \qquad (4)$$

(‘simple interest’), where (see Chapter II, § 1a)

$$\hat H_n = H_n + \sum_{k\le n} \big(e^{\Delta H_k} - \Delta H_k - 1\big) \qquad (5)$$

and $\mathcal E(\hat H) = (\mathcal E(\hat H)_n)_{n\ge0}$ is the stochastic exponential constructed from $\hat H = (\hat H_n)$ by the formulas

$$\mathcal E(\hat H)_n = e^{\hat H_n} \prod_{k\le n} (1 + \Delta\hat H_k)\, e^{-\Delta\hat H_k}, \qquad n \ge 1, \qquad (6)$$

and $\mathcal E(\hat H)_0 = 1$.

Of course, we could simplify the right-hand sides in (5) and (6) and write these equations as

$$\hat H_n = \sum_{k\le n} \big(e^{\Delta H_k} - 1\big) \qquad (7)$$

and

$$\mathcal E(\hat H)_n = \prod_{k\le n} (1 + \Delta\hat H_k). \qquad (8)$$

Still, it is useful to point out again (see Chapter II, § 1a and Chapter III, § 5c) that bearing in mind a similar representation in the continuous-time case, we should regard as ‘right’ formulas ones close to (5) and (6) and not to (7) and (8). This has something to do with the problem of the convergence of the corresponding ‘sums’ $\sum_{s\le t}$ and ‘products’ $\prod_{s\le t}$ which, generally speaking, have infinitely many terms and factors for each $t > 0$ in the continuous-time case.
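
In discrete time the equivalence of the ‘compound interest’ form (3), the ‘simple interest’ form (4) with the product (8), and the longer formula (6) is a purely algebraic identity. The following sketch checks it for a given list of increments $\Delta H_k$; the function names and the sample increments are assumptions made for illustration.

```python
import math


def hat_increments(h):
    """Increments of H-hat: e^{dH_k} - 1 for each increment dH_k of H, cf. (7)."""
    return [math.exp(x) - 1.0 for x in h]


def price_compound(s0, h):
    """S_n = S_0 exp(H_n), the representation (3)."""
    return s0 * math.exp(sum(h))


def price_simple(s0, h):
    """S_n = S_0 * prod(1 + dH-hat_k), i.e. (4) with the product formula (8)."""
    result = s0
    for d in hat_increments(h):
        result *= 1.0 + d
    return result


def stochastic_exponential_long(h):
    """E(H-hat)_n via (6): exp(H-hat_n) * prod((1 + dH-hat_k) * exp(-dH-hat_k))."""
    d = hat_increments(h)
    product_term = 1.0
    for x in d:
        product_term *= (1.0 + x) * math.exp(-x)
    return math.exp(sum(d)) * product_term
```

For increments such as `[0.1, -0.05, 0.2]` all three expressions agree up to floating-point rounding, and each increment of $\hat H$ exceeds $-1$.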

The advantage of (4) over (3) lies in the following result. 
PROPOSITION. In order that a sequence $S = (S_n)$ defined by (4) be a martingale it is sufficient that the sequence $\hat H = (\hat H_n)_{n\ge1}$ be a local martingale with $\Delta\hat H_n > -1$ for $n \ge 1$.

Indeed, from (6) or (8) we see that

$$\Delta\mathcal E(\hat H)_n = \mathcal E(\hat H)_{n-1}\, \Delta\hat H_n \qquad (9)$$

for $n \ge 1$. Assume that $(\hat H_n)$ is a local martingale, and therefore is a martingale transformation (see the lemma in Chapter II, § 1c), which admits a representation

$$\hat H_n = \sum_{k=1}^n a_k\, \Delta M_k \qquad (10)$$

with $\mathcal F_{k-1}$-measurable $a_k$ and some martingale $M = (M_n)$.

From (9) and (10) we see that

$$\Delta\mathcal E(\hat H)_n = a_n\, \mathcal E(\hat H)_{n-1}\, \Delta M_n,$$

i.e., $\mathcal E(\hat H)$ is a martingale transformation and, therefore, a local martingale.

If $\Delta\hat H_n > -1$, then, surely, $\mathcal E(\hat H)_n > 0$. Hence, by the lemma in Chapter II, § 1c the local martingale $\mathcal E(\hat H)$ is actually a martingale.

The condition $\Delta\hat H_n > -1$ holds in our case because

$$\Delta\hat H_n = e^{\Delta H_n} - 1 > -1.$$

3. We shall assume that the $h_n = \Delta H_n$ are conditionally Gaussian variables with $h_n = \mu_n + \sigma_n\varepsilon_n$. Then it is natural to say that the sequence $S = (S_n)$ with $S_n = S_0 e^{H_n}$ is logarithmically conditionally Gaussian, as in the title of § 3c.

First, we consider the question of the conditions ensuring that the sequence $S = (S_n)$ is a martingale with respect to the original measure $P$.

We have already seen that it is sufficient to this end that the sequence $\hat H = (\hat H_n)$ with $\Delta\hat H_n = e^{\Delta H_n} - 1$ be a local martingale, i.e., $E(|\Delta\hat H_n| \mid \mathcal F_{n-1}) < \infty$ and $E(\Delta\hat H_n \mid \mathcal F_{n-1}) = 0$, or, equivalently,

$$E(e^{\Delta H_n} \mid \mathcal F_{n-1}) = 1 \quad (P\text{-a.s.}). \qquad (11)$$

Since we assume that $\Delta H_n = \mu_n + \sigma_n\varepsilon_n$, we can rewrite condition (11) as follows:

$$E(e^{\mu_n + \sigma_n\varepsilon_n} \mid \mathcal F_{n-1}) = 1, \qquad (12)$$

which is equivalent to the relation

$$E(e^{\sigma_n\varepsilon_n} \mid \mathcal F_{n-1}) = e^{-\mu_n}. \qquad (13)$$

The left-hand side here is equal to $e^{\sigma_n^2/2}$. Thus, we arrive at the condition

$$\mu_n + \frac{\sigma_n^2}{2} = 0 \quad (P\text{-a.s.}), \qquad n \ge 1, \qquad (14)$$

ensuring that the logarithmically conditionally Gaussian sequence

$$S_n = S_0 \exp\Big\{\sum_{k=1}^n (\mu_k + \sigma_k\varepsilon_k)\Big\}$$

is a martingale with respect to $P$. Of course, this is what one could expect because the sequence

$$\Big(\exp\Big\{\sum_{k=1}^n \Big(\sigma_k\varepsilon_k - \frac{\sigma_k^2}{2}\Big)\Big\}\Big)_{n\ge1}$$

is a martingale, as already mentioned.


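4. We now proceed to the case when (14) fails.

Assume that $n \le N$. We shall construct the required measure $\tilde P$ on $\mathcal F_N = \mathcal F$ by means of the conditional Esscher transformation, in the following form:

$$\tilde P(d\omega) = Z_N(\omega)\, P(d\omega)$$

with $Z_N(\omega) = \prod_{n=1}^N z_n(\omega)$ and

$$z_n(\omega) = \frac{e^{a_n h_n}}{E(e^{a_n h_n} \mid \mathcal F_{n-1})}, \qquad (16)$$

where we shall choose the $\mathcal F_{n-1}$-measurable variables $a_n = a_n(\omega)$ (here $\mathcal F_0 = \{\varnothing, \Omega\}$) such that the sequence $(S_n)_{n\le N}$ is a $(\tilde P, (\mathcal F_n))$-martingale.

In our case, when the prices are described by formulas (3), this means that, with necessity,

$$E\big[e^{(a_n+1)h_n} \mid \mathcal F_{n-1}\big] = E\big[e^{a_n h_n} \mid \mathcal F_{n-1}\big]. \qquad (17)$$

Bearing in mind that $h_n = \mu_n + \sigma_n\varepsilon_n$, we see that equality (17) holds if

$$\mu_n + \frac{\sigma_n^2}{2} + a_n\sigma_n^2 = 0, \qquad (18)$$

i.e.,

$$a_n = -\frac{\mu_n}{\sigma_n^2} - \frac12. \qquad (19)$$

If (14) holds for all $n \le N$, then $a_n = 0$ and $Z_N = 1$, i.e., $\tilde P = P$.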
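Choosing $a_n$ in accordance with (19) we obtain

$$E(e^{a_n h_n} \mid \mathcal F_{n-1}) = \exp\Big\{-\frac{\mu_n^2}{2\sigma_n^2} + \frac{\sigma_n^2}{8}\Big\}.$$

Thus,

$$z_n = \frac{e^{a_n h_n}}{E(e^{a_n h_n} \mid \mathcal F_{n-1})} = \exp\Big\{-\Big(\frac{\mu_n}{\sigma_n} + \frac{\sigma_n}{2}\Big)\varepsilon_n - \frac12\Big(\frac{\mu_n}{\sigma_n} + \frac{\sigma_n}{2}\Big)^2\Big\}$$

and

$$Z_N = \exp\Big\{-\sum_{n=1}^N \Big[\Big(\frac{\mu_n}{\sigma_n} + \frac{\sigma_n}{2}\Big)\varepsilon_n + \frac12\Big(\frac{\mu_n}{\sigma_n} + \frac{\sigma_n}{2}\Big)^2\Big]\Big\}. \qquad (20)$$

Hence the sequence $S = (S_n)_{n\le N}$ with

$$S_n = S_0 e^{H_n}, \qquad H_n = h_1 + \dots + h_n, \qquad h_n = \mu_n + \sigma_n\varepsilon_n,$$

is a $\tilde P$-martingale with $\tilde E S_n = S_0$, and the density $Z_N$ of the measure $\tilde P$ with respect to $P$ is described in (20). If (14) holds for $n \le N$, then $\tilde P = P$ and $(S_n)_{n\le N}$ is a $(P, (\mathcal F_n))$-martingale, a martingale with respect to the original measure.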
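
The choice (19) of the Esscher parameter and the closed form of the factors in (20) can be checked for a single step with constant $\mu_n = \mu$ and $\sigma_n = \sigma$. The sketch below uses the Gaussian moment generating function $\log E e^{ah} = a\mu + a^2\sigma^2/2$; the function names and the numerical values are illustrative assumptions.

```python
import math


def esscher_parameter(mu, sigma):
    """a_n from (19): -mu / sigma^2 - 1/2."""
    return -mu / sigma ** 2 - 0.5


def log_mgf(a, mu, sigma):
    """log E exp(a h) for h ~ N(mu, sigma^2)."""
    return a * mu + 0.5 * a * a * sigma * sigma


def one_step_growth(mu, sigma):
    """E~(e^{h_n} | F_{n-1}) = E e^{(a+1) h} / E e^{a h}; equals 1 when (17) holds."""
    a = esscher_parameter(mu, sigma)
    return math.exp(log_mgf(a + 1.0, mu, sigma) - log_mgf(a, mu, sigma))


def z_direct(eps, mu, sigma):
    """z_n from (16), evaluated at h_n = mu + sigma * eps."""
    a = esscher_parameter(mu, sigma)
    return math.exp(a * (mu + sigma * eps) - log_mgf(a, mu, sigma))


def z_closed_form(eps, mu, sigma):
    """One factor of (20): exp(-c eps - c^2 / 2) with c = mu/sigma + sigma/2."""
    c = mu / sigma + sigma / 2.0
    return math.exp(-c * eps - 0.5 * c * c)
```

With, say, $\mu = 0.05$ and $\sigma = 0.2$ the one-step growth factor equals 1 up to rounding, and for $\mu = -\sigma^2/2$ the parameter $a_n$ vanishes, in agreement with the remark that $\tilde P = P$ when (14) holds.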


§3d. Discrete Version of Girsanov’s Theorem. General Case 


1. We have already mentioned that the discrete version of Girsanov’s theorem for the conditionally Gaussian case had paved the way to similar results on stochastic sequences $H = (H_n)$, where $h_n = \Delta H_n$ can have a more general structure than the one suggested by the representation $h_n = \mu_n + \sigma_n\varepsilon_n$.

To find generalizations of a ‘right’ form we shall analyze once again our proof of the implication

$$\tilde P \overset{\mathrm{loc}}{\ll} P \ \text{ and } \ E(h_n \mid \mathcal F_{n-1}) = \mu_n \ \Longrightarrow\ \tilde E(h_n \mid \mathcal F_{n-1}) = 0, \qquad n \ge 1$$

(established in the conditionally Gaussian case). This proof in § 3b was based to a considerable extent on the conversion formula for conditional expectations ((4) in § 3a), which has the following form for $\tilde P \overset{\mathrm{loc}}{\ll} P$, $Y = H_n$ (where $\tilde E|H_n| < \infty$), and $m = n - 1$:

$$\tilde E(H_n \mid \mathcal F_{n-1}) = \frac{1}{Z_{n-1}} E(H_n Z_n \mid \mathcal F_{n-1}) \quad (\tilde P\text{-a.s.}), \qquad n \ge 1. \qquad (1)$$

Here $\tilde E$ is averaging with respect to $\tilde P$ and the right-hand side is set equal to zero if $Z_{n-1}(\omega) = 0$. We also set $\mathcal F_0 = \{\varnothing, \Omega\}$ and $Z_0(\omega) = 1$ in what follows.
We claim that one can easily deduce from (1) the following equivalence:

$$\text{if } \tilde P \overset{\mathrm{loc}}{\ll} P, \text{ then } H \in \mathcal M(\tilde P) \iff HZ \in \mathcal M(P), \qquad (2)$$

where $\mathcal M(\tilde P)$ and $\mathcal M(P)$ are the sets of martingales with respect to the measures $\tilde P$ and $P$ (see Chapter II, § 1c).

Indeed, if $H \in \mathcal M(\tilde P)$, then $\tilde E(H_n \mid \mathcal F_{n-1}) = H_{n-1}$ ($\tilde P$-a.s.), and it follows from (1) that $H_{n-1}Z_{n-1} = E(H_nZ_n \mid \mathcal F_{n-1})$ ($\tilde P$-a.s.).

This equality holds not only $\tilde P$-a.s., but also $P$-a.s., for both its sides vanish on the set $\{Z_{n-1} = 0\}$ because we also have $Z_n = 0$ (P-a.s.) in this set. As regards the set $\{Z_{n-1} > 0\}$, the measures $\tilde P$ and $P$ are equivalent there (in the sense of the equivalence $\tilde P(\{Z_{n-1} > 0\} \cap A) = 0 \iff P(\{Z_{n-1} > 0\} \cap A) = 0$ for $A \in \mathcal F_{n-1}$), therefore the left-hand and the right-hand sides of this equality coincide also $P$-almost surely. Thus, we have proved the implication $\Longrightarrow$ in (2).

In a similar way, if $HZ \in \mathcal M(P)$, then $H_{n-1}Z_{n-1} = E(H_nZ_n \mid \mathcal F_{n-1})$ ($P$- and $\tilde P$-a.s.). Since $Z_{n-1} > 0$ $\tilde P$-a.s., it follows that

$$H_{n-1} = \frac{1}{Z_{n-1}} E(H_nZ_n \mid \mathcal F_{n-1}) \quad (\tilde P\text{-a.s.}).$$

Hence it follows by (1) that $H \in \mathcal M(\tilde P)$.
It is worth noting that Bayes’s formula (4) in § 3a (and, therefore, also formula (1)) is a consequence of the implication $\Longrightarrow$ in (2). For let $Y$ be an $\mathcal F_n$-measurable variable such that $\tilde E|Y| < \infty$ and $\tilde P \overset{\mathrm{loc}}{\ll} P$. Then we consider the martingale $(H_m, \mathcal F_m, \tilde P)_{m\le n}$, where $H_m = \tilde E(Y \mid \mathcal F_m)$. By (2) we obtain

$$E(YZ_n \mid \mathcal F_m) = H_mZ_m \quad (P\text{-a.s.}).$$

In particular, $E(YZ_n \mid \mathcal F_{n-1}) = H_{n-1}Z_{n-1}$ ($P$- and $\tilde P$-a.s.).

Since $\tilde P(Z_{n-1} > 0) = 1$, it therefore follows that

$$\frac{1}{Z_{n-1}} E(YZ_n \mid \mathcal F_{n-1}) = H_{n-1} \quad (\tilde P\text{-a.s.}).$$

Together with the equality $H_{n-1} = \tilde E(Y \mid \mathcal F_{n-1})$ ($\tilde P$-a.s.), this proves the required formula (4) in the ‘conversion lemma’ in § 3a:

$$\tilde P \overset{\mathrm{loc}}{\ll} P \ \Longrightarrow\ \tilde E(Y \mid \mathcal F_{n-1}) = \frac{1}{Z_{n-1}} E(YZ_n \mid \mathcal F_{n-1}) \quad (\tilde P\text{-a.s.}).$$

Hence the property (2) is a sort of ‘martingale’ version of the conversion lemma for absolutely continuous changes of measures.
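
The equivalence (2) can be illustrated on a finite space where every conditional expectation is a finite sum. The sketch assumes a hypothetical coin-tossing model (each toss equals 1 with probability 1/2 under $P$ and 1/3 under $\tilde P$) and takes for $H$ the walk centred at the $\tilde P$-mean of a toss; these choices and the function names are ours.

```python
from fractions import Fraction
from itertools import product

# Hypothetical toy parameters (assumptions for illustration only).
P_UP = Fraction(1, 2)    # P(toss = 1)
PT_UP = Fraction(1, 3)   # P~(toss = 1)


def step_prob(x, p_up):
    """One-toss probability of outcome x when P(1) = p_up."""
    return p_up if x == 1 else 1 - p_up


def density(path):
    """Z_n on the atom `path`; the empty path gives Z_0 = 1."""
    z = Fraction(1)
    for x in path:
        z *= step_prob(x, PT_UP) / step_prob(x, P_UP)
    return z


def centred_walk(path):
    """H_n = sum over k of (x_k - 1/3): a martingale under P~ but not under P."""
    return sum((Fraction(x) - PT_UP for x in path), Fraction(0))


def is_martingale(process, p_up, n_max):
    """Check E(X_{n+1} | F_n) = X_n on every atom of F_n, n = 0, ..., n_max - 1."""
    return all(
        sum(process(w + (x,)) * step_prob(x, p_up) for x in (0, 1)) == process(w)
        for n in range(n_max)
        for w in product((0, 1), repeat=n)
    )


def walk_times_density(path):
    """The product H_n Z_n appearing on the right-hand side of (2)."""
    return centred_walk(path) * density(path)
```

Here `is_martingale(centred_walk, PT_UP, 4)` and `is_martingale(walk_times_density, P_UP, 4)` both hold, while `is_martingale(centred_walk, P_UP, 4)` fails, as (2) predicts.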
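2. It can be useful to formulate (2), together with its ‘local’ version, as the following result (cf. [250; Chapter III, § 3b]).

LEMMA. Assume that $\tilde P \overset{\mathrm{loc}}{\ll} P$ and let $Z = (Z_n)$ be the density process:

$$Z_n = \frac{d\tilde P_n}{dP_n}$$

with $\tilde P_n = \tilde P \,|\, \mathcal F_n$ and $P_n = P \,|\, \mathcal F_n$.

Let $H = (H_n, \mathcal F_n)$ be a stochastic sequence. Then

a) $H$ is a $\tilde P$-martingale ($H \in \mathcal M(\tilde P)$) if and only if the sequence $HZ = (H_nZ_n, \mathcal F_n)$ is a $P$-martingale ($HZ \in \mathcal M(P)$):

$$H \in \mathcal M(\tilde P) \iff HZ \in \mathcal M(P).$$

b) If, in addition, $P \overset{\mathrm{loc}}{\ll} \tilde P$, then $H$ is a local $\tilde P$-martingale ($H \in \mathcal M_{\mathrm{loc}}(\tilde P)$) if and only if $HZ$ is a local $P$-martingale ($HZ \in \mathcal M_{\mathrm{loc}}(P)$):

$$H \in \mathcal M_{\mathrm{loc}}(\tilde P) \iff HZ \in \mathcal M_{\mathrm{loc}}(P). \qquad (3)$$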


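Proof. a) We have already proved this using formula (1) (which is also of interest in its own right). However, this can also be proved directly, starting from the definition of a martingale.

We choose $m \le n$ and $A \in \mathcal F_m$. Then $\tilde E(I_AH_n) = E(I_AH_nZ_n)$, so that

$$\tilde E(I_AH_n) = \tilde E(I_AH_m) \iff E(I_AH_nZ_n) = E(I_AH_mZ_n).$$

However, $E(I_AH_mZ_n) = E(I_AH_mZ_m)$. Hence $H \in \mathcal M(\tilde P) \iff HZ \in \mathcal M(P)$.

b) We claim that $HZ \in \mathcal M_{\mathrm{loc}}(P) \Longrightarrow H \in \mathcal M_{\mathrm{loc}}(\tilde P)$ (even if we assume only that $\tilde P \overset{\mathrm{loc}}{\ll} P$).

Let $(\tau_n)$ be a localizing sequence for $HZ$ and let $\tau = \lim \tau_n$. Then $P(\tau = \infty) = 1$ and $\tilde P(\tau < \infty) = E Z_\tau I(\tau < \infty) = 0$ (since $\tilde P \overset{\mathrm{loc}}{\ll} P$ and by property d) in the theorem proved in § 3a). Hence $\tilde P(\tau = \infty) = 1$.

We note that

$$(H^{\tau_n}Z)_k = H_k^{\tau_n}Z_k = (H_kZ_k)^{\tau_n} + H_{\tau_n}(Z_k - Z_{\tau_n})\, I(k > \tau_n).$$

Consequently, $H^{\tau_n}Z$ is a $P$-martingale and $H^{\tau_n} \in \mathcal M(\tilde P)$ by assertion a). However, $\tilde P(\lim \tau_n = \infty) = 1$. Hence $H \in \mathcal M_{\mathrm{loc}}(\tilde P)$.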

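We claim now that, conversely, if $P \overset{\mathrm{loc}}{\ll} \tilde P$, then $H \in \mathcal M_{\mathrm{loc}}(\tilde P) \Longrightarrow HZ \in \mathcal M_{\mathrm{loc}}(P)$.

Let $(\sigma_n)$ be a localizing sequence for $H \in \mathcal M_{\mathrm{loc}}(\tilde P)$. Then $\tilde P(\lim \sigma_n = \infty) = 1$ and $H^{\sigma_n} \in \mathcal M(\tilde P)$. Hence $H^{\sigma_n}Z \in \mathcal M(P)$ by a), and since

$$(HZ)_k^{\sigma_n} = H_k^{\sigma_n}Z_k - H_{\sigma_n}(Z_k - Z_{\sigma_n})\, I(k > \sigma_n)$$

(cf. the above formula for $(H^{\tau_n}Z)_k$), it follows that $(HZ)^{\sigma_n} \in \mathcal M(P)$. However, $P \overset{\mathrm{loc}}{\ll} \tilde P$, so that $P(\lim \sigma_n = \infty) = 1$. Hence $HZ \in \mathcal M_{\mathrm{loc}}(P)$.

The proof is complete.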


3. Properties (1), (2), and (3) are of fundamental importance for the problem of verifying the martingale property with respect to one or another measure $\tilde{\mathsf P}$ for the sequences $(H_n)$, $(\hat H_n)$, $(S_n)$, etc., because they reduce this task to verifying the martingale property with respect to the original, basic measure $\mathsf P$ for the sequences $(H_n Z_n)$, $(\hat H_n Z_n)$, $(S_n Z_n)$.

In fact, we have already used this in our proof of the discrete analog of Girsanov's theorem in the conditionally Gaussian case, when $H_n = h_1 + \dots + h_n$ and $h_n = \mu_n + \sigma_n \varepsilon_n$, $n \le N$, and the $\tilde{\mathsf P}_N$ are the measures with densities

$$Z_N = \exp\Big\{ -\sum_{k=1}^{N} \frac{\mu_k}{\sigma_k}\,\varepsilon_k - \frac12 \sum_{k=1}^{N} \Big(\frac{\mu_k}{\sigma_k}\Big)^2 \Big\} \tag{4}$$

explicitly constructed from $(\mu_k)$ and $(\sigma_k)$.

However, the general case of the construction of the measures $\tilde{\mathsf P}_N$ is considerably more complicated. The simplicity of the formulas for the $Z_N$ in the conditionally Gaussian case is essentially the effect of the simple formulas for the $H_n$: $\Delta H_n = h_n = \mu_n + \sigma_n \varepsilon_n$.

There exist various generalizations of Girsanov’s theorem in the discrete-time 
case. 

To see more clearly in what sense the results below generalize this theorem, it is reasonable to reformulate its already established version (i.e., the theorem in § 3b) in the conditionally Gaussian case.


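We set

$$\alpha_n = \frac{Z_n}{Z_{n-1}} \qquad (Z_n > 0). \tag{5}$$

Then

$$\alpha_n = \exp\Big\{ -\frac{\mu_n}{\sigma_n}\,\varepsilon_n - \frac12 \Big(\frac{\mu_n}{\sigma_n}\Big)^2 \Big\}, \tag{6}$$

and if $M_n = \sum_{k=1}^{n} \sigma_k \varepsilon_k$, then $M = (M_n) \in \mathcal M_{\mathrm{loc}}(\mathsf P)$ and it is easy to show that

$$\mathsf E(\alpha_n \Delta M_n \mid \mathcal F_{n-1}) = -\mu_n. \tag{7}$$

Thus, leaving aside the questions of integrability for a while, in the conditionally Gaussian case we can state the theorem in § 3b as follows:

$$\begin{aligned}
M \in \mathcal M_{\mathrm{loc}}(\mathsf P) &\iff \mathsf E(\sigma_n \varepsilon_n \mid \mathcal F_{n-1}) = 0, && n \le N \\
&\iff \mathsf E(\mu_n + \sigma_n \varepsilon_n \mid \mathcal F_{n-1}) = \mu_n, && n \le N \\
&\iff \mathsf E(h_n \mid \mathcal F_{n-1}) = \mu_n, && n \le N \\
&\Longrightarrow \tilde{\mathsf E}(h_n \mid \mathcal F_{n-1}) = 0, && n \le N \\
&\iff \tilde{\mathsf E}(\Delta M_n + \mu_n \mid \mathcal F_{n-1}) = 0, && n \le N \\
&\iff \tilde{\mathsf E}\big(\Delta M_n - \mathsf E(\alpha_n \Delta M_n \mid \mathcal F_{n-1}) \,\big|\, \mathcal F_{n-1}\big) = 0, && n \le N.
\end{aligned}$$

In other words, the inclusion $M \in \mathcal M_{\mathrm{loc}}(\mathsf P)$ means that the sequence $\tilde M = (\tilde M_n)_{n \le N}$, where

$$\tilde M_n = M_n - \sum_{k=1}^{n} \mathsf E(\alpha_k \Delta M_k \mid \mathcal F_{k-1}), \tag{8}$$

is a local martingale with respect to the measure $\tilde{\mathsf P}(d\omega) = Z_N(\omega)\,\mathsf P(d\omega)$:

$$M \in \mathcal M_{\mathrm{loc}}(\mathsf P) \Longrightarrow \tilde M \in \mathcal M_{\mathrm{loc}}(\tilde{\mathsf P}). \tag{9}$$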


We emphasize an important point in the above analysis. In our discussion of the sequence $H = (H_n)$ with $\Delta H_n = \mu_n + \Delta M_n$ we were primarily interested in the martingale component of $H$. In effect, we 'traced' the change of the martingale component under an absolutely continuous change of measure. As we see, if we consider the measure $\tilde{\mathsf P}$, then $(M_n)$ is no longer a martingale: it has the representation

$$M_n = \sum_{k=1}^{n} \mathsf E(\alpha_k \Delta M_k \mid \mathcal F_{k-1}) + \tilde M_n,$$

where $\tilde M = (\tilde M_n)$ is a $\tilde{\mathsf P}$-martingale and $\tilde A = \big(\tilde A_n = \sum_{k=1}^{n} \mathsf E(\alpha_k \Delta M_k \mid \mathcal F_{k-1})\big)$ is some predictable drift. It is the appearance of this additional 'drift' term after an absolutely continuous change of measure that enables one to 'kill' the drift components of the original sequences $H = (H_n)$ by means of a transition to measures $\tilde{\mathsf P}$ that are absolutely continuous (or locally absolutely continuous) with respect to $\mathsf P$.
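The 'drift-killing' mechanism of the conditionally Gaussian case can be checked numerically for a single step. The following Python sketch integrates against the standard normal density with a plain trapezoidal rule; the values of `mu` and `sigma` and the helper `gaussian_expectation` are illustrative assumptions, not part of the text.

```python
import math

def gaussian_expectation(f, lo=-12.0, hi=12.0, steps=48_000):
    """Approximate E f(eps) for eps ~ N(0, 1) with the trapezoidal rule."""
    dx = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        x = lo + i * dx
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * f(x) * math.exp(-0.5 * x * x)
    return total * dx / math.sqrt(2.0 * math.pi)

mu, sigma = 0.3, 0.8   # assumed (constant) values of mu_n and sigma_n

def alpha(eps):
    """One-step density alpha_n = Z_n / Z_{n-1} of formula (6)."""
    a = mu / sigma
    return math.exp(-a * eps - 0.5 * a * a)

# E(alpha_n | F_{n-1}): should be 1, so that alpha_n is a density.
mean_alpha = gaussian_expectation(alpha)
# E(alpha_n * Delta M_n | F_{n-1}) with Delta M_n = sigma * eps: formula (7) predicts -mu.
drift_term = gaussian_expectation(lambda e: alpha(e) * sigma * e)
# Mean of h_n = mu + sigma * eps under the new measure: should vanish.
new_mean_h = gaussian_expectation(lambda e: alpha(e) * (mu + sigma * e))
```

Under these assumptions `drift_term` reproduces $-\mu_n$ and `new_mean_h` vanishes up to quadrature error, which is the one-step content of (7) and (9).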


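4. This interpretation of the above discrete version of Girsanov's theorem (in the conditionally Gaussian case) enables us to state the following general result on local martingales, where we do not specify that $M_n = \sum_{k=1}^{n} \sigma_k \varepsilon_k$.

THEOREM 1. Let $M \in \mathcal M_{\mathrm{loc}}(\mathsf P)$ with $M_0 = 0$. Assume that $\tilde{\mathsf P} \overset{\mathrm{loc}}{\ll} \mathsf P$, let $Z_n = \dfrac{d\tilde{\mathsf P}_n}{d\mathsf P_n}$, $n \ge 1$, be the corresponding densities, and let $\alpha_n = \dfrac{Z_n}{Z_{n-1}}\, I(Z_{n-1} > 0)$, where $Z_0 = 1$. Also, let

$$\mathsf E(|\Delta M_n|\,\alpha_n \mid \mathcal F_{n-1}) < \infty \quad (\mathsf P\text{-a.s.}), \qquad n \ge 1. \tag{10}$$

Then the process $\tilde M = (\tilde M_n)$ defined in (8) belongs to $\mathcal M_{\mathrm{loc}}(\tilde{\mathsf P})$ (i.e., is a local $\tilde{\mathsf P}$-martingale).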


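Proof. Again (as in the proof for the conditionally Gaussian case in § 3b), we use Bayes's formula (4) in § 3a:

$$\begin{aligned}
\tilde{\mathsf E}(M_n \mid \mathcal F_{n-1}) &= \mathsf E(M_n \alpha_n \mid \mathcal F_{n-1}) \\
&= \mathsf E(\alpha_n (M_n - M_{n-1}) \mid \mathcal F_{n-1}) + \mathsf E(\alpha_n M_{n-1} \mid \mathcal F_{n-1}) \\
&= \mathsf E(\alpha_n \Delta M_n \mid \mathcal F_{n-1}) + M_{n-1}.
\end{aligned} \tag{11}$$

Hence, by assumption (10), we obtain

$$\tilde{\mathsf E}(|M_n| \mid \mathcal F_{n-1}) \le \mathsf E(|\alpha_n \Delta M_n| \mid \mathcal F_{n-1}) + |M_{n-1}| < \infty$$

($\mathsf P$- and $\tilde{\mathsf P}$-a.s.).

From (11) and (8) we immediately see that $\tilde{\mathsf E}(|\tilde M_n| \mid \mathcal F_{n-1}) < \infty$ and

$$\tilde{\mathsf E}(\tilde M_n \mid \mathcal F_{n-1}) = \tilde M_{n-1}, \tag{12}$$

i.e., $\tilde M$ is a local $\tilde{\mathsf P}$-martingale ($\tilde M \in \mathcal M_{\mathrm{loc}}(\tilde{\mathsf P})$) by the theorem in Chapter II, § 1c.1.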
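
5. Now, assume that the basic sequence $H = (H_n)_{n \ge 1}$ has the representation

$$H_n = A_n + M_n, \tag{13}$$

where $A = (A_n)_{n \ge 1}$ is a predictable sequence (the $A_n$ are $\mathcal F_{n-1}$-measurable for $n \ge 1$, where $\mathcal F_0 = \{\varnothing, \Omega\}$, and $A_0 = 0$) and $M = (M_n)_{n \ge 1} \in \mathcal M_{\mathrm{loc}}(\mathsf P)$.

Since $\mathsf E(|\Delta M_n| \mid \mathcal F_{n-1}) < \infty$ for a local martingale, it follows that

$$\mathsf E(|\Delta H_n| \mid \mathcal F_{n-1}) \le |\Delta A_n| + \mathsf E(|\Delta M_n| \mid \mathcal F_{n-1}) < \infty,$$

so that the $H_n$, $n \ge 1$, have also the representations

$$H_n = \sum_{k=1}^{n} \mathsf E(\Delta H_k \mid \mathcal F_{k-1}) + \sum_{k=1}^{n} \big[\Delta H_k - \mathsf E(\Delta H_k \mid \mathcal F_{k-1})\big], \tag{14}$$

which we have called the generalized Doob decomposition (see Chapter II, § 1b) of the sequence $H = (H_n)_{n \ge 1}$.

As in the case of the usual Doob decomposition, the representation of the form (13) with predictable $(A_n)$ is unique, and therefore

$$A_n = \sum_{k=1}^{n} \mathsf E(\Delta H_k \mid \mathcal F_{k-1}) \tag{15}$$

and

$$M_n = \sum_{k=1}^{n} \big(\Delta H_k - \mathsf E(\Delta H_k \mid \mathcal F_{k-1})\big). \tag{16}$$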


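Theorem 1 above has the following simple generalization.

THEOREM 2. Let $H = (H_n)_{n \ge 1}$ be a sequence with generalized Doob decomposition (14) and assume that (10) holds. Let $\tilde{\mathsf P}$ be a measure such that $\tilde{\mathsf P} \overset{\mathrm{loc}}{\ll} \mathsf P$. Then the sequence $H = (H_n)_{n \ge 1}$ has a representation

$$H_n = \tilde A_n + \tilde M_n, \tag{17}$$

or, equivalently,

$$H_n = \sum_{k=1}^{n} \tilde{\mathsf E}(\Delta H_k \mid \mathcal F_{k-1}) + \sum_{k=1}^{n} \big[\Delta H_k - \tilde{\mathsf E}(\Delta H_k \mid \mathcal F_{k-1})\big] \tag{18}$$

(a generalized Doob decomposition), where

$$\tilde A_n = A_n + \sum_{k=1}^{n} \mathsf E(\alpha_k \Delta M_k \mid \mathcal F_{k-1}) \tag{19}$$

and the sequence $\tilde M$ of the variables

$$\tilde M_n = M_n - \sum_{k=1}^{n} \mathsf E(\alpha_k \Delta M_k \mid \mathcal F_{k-1}) \tag{20}$$

is a local $\tilde{\mathsf P}$-martingale ($\tilde M \in \mathcal M_{\mathrm{loc}}(\tilde{\mathsf P})$).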


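Proof. This is an immediate consequence of Theorem 1 applied to the sequence $M = (M_n)_{n \ge 1}$ with $M_n = H_n - A_n$.

6. Under the hypothesis of Theorem 2 we assume now that $M$ and $Z$ are (locally) square integrable martingales. Then they have a well-defined predictable quadratic covariance $\langle M, Z\rangle = (\langle M, Z\rangle_n)_{n \ge 0}$, where

$$\langle M, Z\rangle_n = \sum_{k=1}^{n} \mathsf E(\Delta M_k \Delta Z_k \mid \mathcal F_{k-1}). \tag{21}$$

One sometimes calls $\langle M, Z\rangle$ simply the 'angular brackets' of $M$ and $Z$; as regards the corresponding definitions in the continuous-time case, see, e.g., [250; Chapter II, §§ 4a, b] or [439; Chapter VII, § 1] and Chapter II, § 5b. We also recall that the quadratic covariance of the sequences $X = (X_n)_{n \ge 0}$ and $Y = (Y_n)_{n \ge 0}$ is the sequence $[X, Y]$ of variables

$$[X, Y]_n = \sum_{k=1}^{n} \Delta X_k \Delta Y_k. \tag{22}$$

By (21) and (22) we obtain that, in the case of (locally) square integrable martingales, the difference $[M, Z] - \langle M, Z\rangle$ is a local martingale (see [250; Chapter I, § 4e]).

We shall now assume that $\tilde{\mathsf P} \overset{\mathrm{loc}}{\sim} \mathsf P$. Then $Z_n > 0$ ($\mathsf P$- and $\tilde{\mathsf P}$-a.s.) and

$$\frac{\Delta\langle M, Z\rangle_n}{Z_{n-1}} = \frac{\mathsf E[\Delta M_n \Delta Z_n \mid \mathcal F_{n-1}]}{Z_{n-1}} = \mathsf E[(\alpha_n - 1)\Delta M_n \mid \mathcal F_{n-1}] = \mathsf E[\alpha_n \Delta M_n \mid \mathcal F_{n-1}]. \tag{23}$$

Note that if $\Delta\langle M\rangle_n(\omega) = 0$, where $\langle M\rangle$ ($= \langle M, M\rangle$) is the quadratic characteristic, then also $\Delta\langle M, Z\rangle_n(\omega) = 0$. Hence the left-hand side of (23) can be rewritten as follows (here $\langle M\rangle_n = \langle M, M\rangle_n$):

$$\frac{\Delta\langle M, Z\rangle_n}{Z_{n-1}} = -a_n\,\Delta\langle M\rangle_n, \tag{24}$$

where

$$a_n = -\frac{\Delta\langle M, Z\rangle_n}{\Delta\langle M\rangle_n\, Z_{n-1}}$$

and where we set $\dfrac{\Delta\langle M, Z\rangle_n}{\Delta\langle M\rangle_n}$ to be equal, say, to 1 if $\Delta\langle M\rangle_n = 0$.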
Thus, (19) can be written as follows:

$$\tilde A_n = A_n - \sum_{k=1}^{n} a_k\,\Delta\langle M\rangle_k. \tag{25}$$

Hence we can make an important conclusion as regards the structure of the original sequence $H$ (with respect to $\mathsf P$): if this sequence is a local martingale with respect to a measure $\tilde{\mathsf P} \overset{\mathrm{loc}}{\sim} \mathsf P$ (i.e., $\tilde A = 0$), then

$$H_n = \sum_{k=1}^{n} a_k\,\Delta\langle M\rangle_k + M_n, \qquad n \ge 1, \tag{26}$$

or, in terms of increments,

$$\Delta H_n = a_n\,\Delta\langle M\rangle_n + \Delta M_n, \qquad n \ge 1. \tag{27}$$


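7. So far we have based our arguments on the mere existence of the measure $\tilde{\mathsf P}$, without specifying its structure or the structure of the sequence $\alpha = (\alpha_n)$ involved in the definition of the Radon-Nikodym derivatives

$$\frac{d\tilde{\mathsf P}_n}{d\mathsf P_n} = \prod_{k=1}^{n} \alpha_k, \qquad n \ge 1. \tag{28}$$

By (23) and (24) we obtain

$$a_n\,\Delta\langle M\rangle_n = \mathsf E[(1 - \alpha_n)\Delta M_n \mid \mathcal F_{n-1}]. \tag{29}$$

This relation can be regarded as an equation with respect to the ($\mathcal F_n$-measurable) variable $\alpha_n$, and one can see that it has the following (not necessarily unique) solution:

$$\alpha_n = 1 - a_n\,\Delta M_n. \tag{30}$$

Of course, only those solutions $\alpha_n$ satisfying the relation $\mathsf P(\alpha_n > 0) = 1$, $n \ge 1$, are suitable for our aims. If this holds, then

$$\frac{d\tilde{\mathsf P}_n}{d\mathsf P_n} = \prod_{k=1}^{n} (1 - a_k\,\Delta M_k) = \mathcal E\Big(-\sum_{k=1}^{\cdot} a_k\,\Delta M_k\Big)_n, \tag{31}$$

where $\mathcal E(R) = (\mathcal E(R)_n)$ is the stochastic exponential (see Chapter II, § 1):

$$\mathcal E(R)_n = e^{R_n} \prod_{k \le n} (1 + \Delta R_k)\,e^{-\Delta R_k} = \prod_{k \le n} (1 + \Delta R_k). \tag{32}$$

Let $\tilde{\mathsf P}$ be a probability measure such that its restrictions $\tilde{\mathsf P}_n = \tilde{\mathsf P} \mid \mathcal F_n$ can be recovered by formulas (31). Our original sequence $H = (H_n)$, which satisfies (27), becomes a local martingale with respect to this measure because $\Delta\tilde A_n = a_n\,\Delta\langle M\rangle_n + \mathsf E(\alpha_n \Delta M_n \mid \mathcal F_{n-1}) = 0$ for $n \ge 1$ and $\tilde A_0 = 0$.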
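A one-step illustration of formulas (27), (30), and (31) on a finite probability space may be helpful. The sketch below uses exact rational arithmetic; the three-point distribution of the increment is a hypothetical choice, made only so that the one-step densities $\alpha_n$ come out positive.

```python
from fractions import Fraction as F

# Hypothetical one-step law of the increment Delta H under P.
values = [F(-1, 10), F(0), F(2, 10)]
probs = [F(3, 10), F(4, 10), F(3, 10)]

mean = sum(p * x for p, x in zip(probs, values))                # predictable drift Delta A
var = sum(p * (x - mean) ** 2 for p, x in zip(probs, values))   # Delta <M>, with Delta M = Delta H - mean
a = mean / var                                                   # coefficient a_n of (27)

# One-step densities alpha_n = 1 - a_n * Delta M_n, formula (30).
alpha = [1 - a * (x - mean) for x in values]
new_probs = [p * al for p, al in zip(probs, alpha)]              # one-step law under the new measure
new_mean = sum(q * x for q, x in zip(new_probs, values))         # drift under the new measure
```

Here all `alpha` values are positive, `new_probs` sums to one, and `new_mean` is exactly zero, so the increment becomes a martingale difference under the measure with density (31).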

We have already pointed out that this probability measure $\tilde{\mathsf P}$, a martingale measure, is not unique in general. However, it has certain advantages. First, it can be explicitly constructed from the coefficients $a_n$. Second, it has certain properties of 'minimality', which justify its name of minimal measure (see [429] and Chapter VI, § 3d.6).

§ 3e. Integer-Valued Random Measures and Their Compensators.
Transformation of Compensators
under Absolutely Continuous Changes of Measures.
'Stochastic Integrals'


1. Let $H = (H_n)_{n \ge 0}$ be a stochastic sequence of random variables $H_n = H_n(\omega)$ defined on a filtered probability space $(\Omega, \mathcal F, (\mathcal F_n)_{n \ge 0}, \mathsf P)$. We shall assume that $H_0 = 0$ and $\mathcal F_0 = \{\varnothing, \Omega\}$.

There exist two ways to describe the probability distribution $\mathrm{Law}(H)$ of $H$: in terms of the unconditional distributions of the variables $H_1, H_2, \dots, H_n$,

$$\mathrm{Law}(H_1, H_2, \dots, H_n) \tag{1}$$

(or, equivalently, $\mathrm{Law}(\Delta H_1, \Delta H_2, \dots, \Delta H_n)$) or in terms of the (regular) conditional distributions of the variables $\Delta H_n$:

$$\mathsf P(\Delta H_n \in \cdot \mid \mathcal F_{n-1}), \qquad n \ge 1. \tag{2}$$

The second way has certain advantages because, given conditional distributions, one can, of course, find the unconditional ones. (In addition, the conditional distributions exhibit the dependence of the $\Delta H_n$ on the 'past' in a more explicit form.) On the other hand, starting from unconditional distributions (1) one can recover only the conditional distributions $\mathsf P(\Delta H_n \in \cdot \mid \mathcal F^H_{n-1})$, where $\mathcal F^H_{n-1} = \sigma(\omega\colon H_1, \dots, H_{n-1})$, so that $\mathcal F^H_{n-1} \subseteq \mathcal F_{n-1}$ (this can also be a proper inclusion).

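2. We consider a $d$-dimensional stochastic sequence

$$X = (X_n, \mathcal F_n)_{n \ge 0} \tag{3}$$

on a filtered probability space $(\Omega, \mathcal F, (\mathcal F_n)_{n \ge 0}, \mathsf P)$. Let $X_0 = 0$ and let $\mathcal F_0 = \{\varnothing, \Omega\}$.

We associate with $X$ the sequence $\mu = (\mu_n(\cdot))_{n \ge 1}$ of integer-valued random measures defined as follows:

$$\mu_n(A; \omega) = I_A(\Delta X_n(\omega)), \qquad A \in \mathcal B(\mathbb R^d),$$

i.e.,

$$\mu_n(A; \omega) = \begin{cases} 1 & \text{if } \Delta X_n(\omega) \in A, \\ 0 & \text{if } \Delta X_n(\omega) \notin A. \end{cases}$$

Further, let $\nu = (\nu_n(\cdot))_{n \ge 1}$ be the sequence of regular conditional distributions $\nu_n(\cdot)$ of the variables $\Delta X_n$ with respect to the $\sigma$-algebras $\mathcal F_{n-1}$, i.e., of the functions $\nu_n(A; \omega)$ (defined for $A \in \mathcal B(\mathbb R^d)$ and $\omega \in \Omega$) such that

1) $\nu_n(\cdot\,; \omega)$ is a probability measure on $(\mathbb R^d, \mathcal B(\mathbb R^d))$ for each $\omega \in \Omega$;

2) $\nu_n(A; \omega)$, regarded as a function of $\omega$ for fixed $A \in \mathcal B(\mathbb R^d)$, is some realization of the conditional probability $\mathsf P(\Delta X_n \in A \mid \mathcal F_{n-1})(\omega)$, i.e.,

$$\nu_n(A; \omega) = \mathsf P(\Delta X_n \in A \mid \mathcal F_{n-1})(\omega) \quad (\mathsf P\text{-a.s.}).$$

(The proof of the existence of such a realization of the conditional probability can be found, e.g., in [439; Chapter II, § 7].)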
For regular conditional distributions the conditional expectations

$$\mathsf E[f(\Delta X_n) \mid \mathcal F_{n-1}](\omega)$$

can be calculated for a non-negative or bounded function $f$ by means of integration over the regular conditional distributions $\nu_n(\cdot\,;\omega)$ for fixed $\omega$:

$$\mathsf E[f(\Delta X_n) \mid \mathcal F_{n-1}](\omega) = \int_{\mathbb R^d} f(x)\,\nu_n(dx; \omega) \quad (\mathsf P\text{-a.s.}).$$

Thus, in the case under consideration we have

$$\nu_n(A; \cdot) = \mathsf E[\mu_n(A; \omega) \mid \mathcal F_{n-1}](\cdot),$$

so that for each $A \in \mathcal B(\mathbb R^d)$ the sequence

$$(\mu_n(A) - \nu_n(A))_{n \ge 1}$$

with $\mu_n(A) = \mu_n(A; \omega)$ and $\nu_n(A) = \nu_n(A; \omega)$ is a martingale difference with respect to the measure $\mathsf P$ and the flow $(\mathcal F_n)$.

We set

$$\mu_{(0,n]}(A; \omega) = \sum_{k=1}^{n} \mu_k(A; \omega) \quad\text{and}\quad \nu_{(0,n]}(A; \omega) = \sum_{k=1}^{n} \nu_k(A; \omega).$$

Then, clearly, for each $A \in \mathcal B(\mathbb R^d)$ the sequence

$$\big(\mu_{(0,n]}(A; \omega) - \nu_{(0,n]}(A; \omega)\big)_{n \ge 1}$$

is a martingale. This property explains why one calls the (random) measure $\nu_{(0,n]}(\cdot)$ the compensator of the (random) measure $\mu_{(0,n]}(\cdot)$ and why one calls the sequence

$$\mu - \nu = \big(\mu_{(0,n]}(\cdot) - \nu_{(0,n]}(\cdot)\big)_{n \ge 1}$$

a random martingale measure.

Note that the representation

$$\mu = \nu + (\mu - \nu)$$

of the measure $\mu = (\mu_{(0,n]})_{n \ge 1}$ with predictable measure $\nu = (\nu_{(0,n]})_{n \ge 1}$ is a kind of Doob decomposition (Chapter II, § 1b) into the sum of a predictable and a martingale component.


3. Construction of Martingale Measures 461 


Remark. One can also associate with the sequence $X = (X_n, \mathcal F_n)_{n \ge 0}$ the integer-valued random measures of jumps $\mu^X = (\mu^X_{(0,n]}(\cdot))_{n \ge 1}$, where

$$\mu^X_{(0,n]}(A; \omega) = \sum_{k=1}^{n} \mu^X_k(A; \omega)$$

and

$$\mu^X_k(A; \omega) = I(\Delta X_k(\omega) \in A,\ \Delta X_k(\omega) \ne 0).$$

Clearly, if $A \in \mathcal B(\mathbb R^d \setminus \{0\})$, then $\mu_n(A; \omega) = \mu^X_n(A; \omega)$. All distinctions between these measures are concentrated at the 'no-jump' events $\{\omega\colon \Delta X_n(\omega) = 0\}$. If $\mathsf P\{\omega\colon \Delta X_n(\omega) = 0\} = 0$, then the measures $\mu$ and $\mu^X$ are essentially the same.

We note that it is the random jump measures $\mu^X$, rather than $\mu$, that play the central role in the continuous-time case, in the description of the properties of the jump components of stochastic processes in terms of integer-valued random measures. (See Chapter VII, § 3a of this book and [250; Chapter II, 1.16] for greater detail.)


3. In this subsection we consider the 'stochastic integrals'

$$w * \mu, \qquad w * \nu, \qquad w * (\mu - \nu)$$

with respect to the just introduced random measures $\mu$, $\nu$, and $\mu - \nu$.

Let $w = (w_k(\omega, x))_{k \ge 1}$ be a sequence of $\mathcal F \otimes \mathcal B(\mathbb R^d)$-measurable functions. We denote by $w * \mu$ the (dependent on $\omega$) sequence of the sums of the Stieltjes integrals

$$(w * \mu)_n(\omega) = \sum_{k=1}^{n} \int_{\mathbb R^d} w_k(\omega, x)\,\mu_k(dx; \omega).$$

Our integer-valued random measures $\mu_k$ are very special in that they take only two values, 0 and 1. Hence

$$\int_{\mathbb R^d} w_k(\omega, x)\,\mu_k(dx; \omega) = w_k(\omega, \Delta X_k(\omega)).$$

Consequently, we have in fact

$$(w * \mu)_n(\omega) = \sum_{k=1}^{n} w_k(\omega, \Delta X_k(\omega)).$$

In a similar way we can define the 'stochastic' integrals $w * \nu$ and $w * (\mu - \nu)$ with respect to the measures $\nu$ and $\mu - \nu$ in terms of Stieltjes integrals. Note that we must impose the following condition of integrability on the $w_k(\omega, x)$ for the existence of these 'integrals':

$$\int_{\mathbb R^d} |w_k(\omega, x)|\,\nu_k(dx; \omega) < \infty$$

for all (or almost all) $\omega \in \Omega$ and $k \ge 1$.

It is easy to see in this case that

$$w * (\mu - \nu) = w * \mu - w * \nu.$$

(We must warn the reader against extending this property automatically to the case of general integer-valued random measures, e.g., the jump measures of stochastic processes with continuous time: the integral $w * (\mu - \nu)$ may be well defined while $w * \mu$ and $w * \nu$ are equal to $+\infty$, so that their difference has no definite value; see [250; Chapter III] for greater detail.)

Assuming additionally that the functions $w_k(\omega, x)$ are $\mathcal F_{k-1}$-measurable for each $x \in \mathbb R^d$ we obtain the predictable (i.e., $\mathcal F_{n-1}$-measurable) integrals $(w * \nu)_n$.

If, moreover,

$$\mathsf E \int_{\mathbb R^d} |w_k(\omega, x)|\,\nu_k(dx; \omega) < \infty \tag{4}$$

for each $k \ge 1$, then it is easy to see that the sequence $w * (\mu - \nu) = ((w * (\mu - \nu))_n)_{n \ge 1}$ is a martingale.

Replacing (4) by the conditions

$$\mathsf E \int_{\mathbb R^d} |w_{k \wedge \tau_n}(\omega, x)|\,\nu_{k \wedge \tau_n}(dx; \omega) < \infty, \qquad k \ge 1,\ n \ge 1, \tag{4'}$$

where $(\tau_n)$ is some localizing sequence of Markov times ($\tau_n \le \tau_{n+1}$, $\tau_n \uparrow \infty$), we obtain a sequence $w * (\mu - \nu)$ that is a local martingale.
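The integrals $w * \mu$, $w * \nu$, and $w * (\mu - \nu)$ can be made concrete in a toy setting. The sketch below assumes independent, identically distributed increments with a hypothetical three-point law (so that every compensator $\nu_k$ is that same law) and a predictable integrand that depends on $\omega$ only through $X_{k-1}$; both choices are illustrative assumptions.

```python
from fractions import Fraction as F
from itertools import product

# Hypothetical common law of the increments Delta X_k (value -> probability).
law = {F(-1): F(1, 4), F(0): F(1, 4), F(2): F(1, 2)}

def w(prev_x, x):
    """Predictable integrand w_k(omega, x): depends on omega only via X_{k-1}."""
    return (1 + prev_x ** 2) * x ** 2

def integrals(path):
    """Return the pairs ((w*mu)_k, (w*nu)_k), k = 1..n, along a path of increments."""
    x_prev, w_mu, w_nu, out = F(0), F(0), F(0), []
    for dx in path:
        w_mu += w(x_prev, dx)                                  # integral against mu_k
        w_nu += sum(p * w(x_prev, y) for y, p in law.items())  # integral against nu_k
        out.append((w_mu, w_nu))
        x_prev += dx
    return out

def cond_mean_step2(dx1):
    """E[(w*(mu-nu))_2 | Delta X_1 = dx1], by summing over the values of Delta X_2."""
    total = F(0)
    for dx2, p in law.items():
        w_mu, w_nu = integrals((dx1, dx2))[1]
        total += p * (w_mu - w_nu)
    return total

# Unconditional mean of (w*(mu-nu))_2 over all two-step paths.
mean_step2 = sum(
    law[a] * law[b] * (integrals((a, b))[1][0] - integrals((a, b))[1][1])
    for a, b in product(law, repeat=2)
)
```

In this example `cond_mean_step2(dx1)` coincides with the value of $(w * (\mu - \nu))_1$ on the event $\{\Delta X_1 = \texttt{dx1}\}$, which is the martingale property in its simplest form.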


4. We now consider the Doob decomposition of the sequence $H = (H_n)_{n \ge 1}$, where we assume that $\mathsf E|h_n| < \infty$ for $h_n = \Delta H_n$, $n \ge 1$. We have (see Chapter II, § 1b)

$$H_n = A_n + M_n, \tag{5}$$

where

$$A_n = \sum_{k \le n} \mathsf E(h_k \mid \mathcal F_{k-1}) \tag{6}$$

and

$$M_n = \sum_{k \le n} \big[h_k - \mathsf E(h_k \mid \mathcal F_{k-1})\big]. \tag{7}$$

We can represent the variables $A_n$ and $M_n$ as follows in terms of the measures $\mu = (\mu_n)_{n \ge 1}$ introduced above and their compensators $\nu = (\nu_n)_{n \ge 1}$:

$$A_n = \sum_{k \le n} \int x\,\nu_k(dx; \omega), \tag{8}$$

$$M_n = \sum_{k \le n} \int x\,\big(\mu_k(dx; \omega) - \nu_k(dx; \omega)\big). \tag{9}$$

For brevity, one denotes the right-hand sides of (8) and (9) by

$$(x * \nu)_n \tag{10}$$

and

$$(x * (\mu - \nu))_n \tag{11}$$

respectively (see [250; Chapter II]).

Thus,

$$H_n = (x * \nu)_n + (x * (\mu - \nu))_n, \tag{12}$$

or, in the coordinate-free notation,

$$H = x * \nu + x * (\mu - \nu). \tag{13}$$

Of course, $H = x * \mu$ in our case, so that (13) is in fact the equality

$$x * \mu = x * \nu + x * (\mu - \nu),$$

as obvious as is the Doob decomposition under the assumption $\mathsf E|h_n| < \infty$, $n \ge 1$.
In place of the condition that $\mathsf E|h_n| < \infty$ for $n \ge 1$ we shall now assume that ($\mathsf P$-a.s.)

$$\mathsf E(|h_n| \mid \mathcal F_{n-1}) < \infty, \qquad n \ge 1. \tag{14}$$

Under this assumption, the sequences $A = (A_n)$ and $M = (M_n)$ are certainly well defined (by formulas (6) and (7)), and $M$ is a local martingale since

$$\mathsf E(|\Delta M_n| \mid \mathcal F_{n-1}) < \infty \quad\text{and}\quad \mathsf E(\Delta M_n \mid \mathcal F_{n-1}) = 0.$$

Hence, if (14) holds, then the sequence $H = (H_n)$ has a generalized Doob decomposition

$$H = A + M, \tag{15}$$

where $A = (A_n)$ and $M = (M_n)$ are as in (6) and (7).

Moreover, $A$ is a predictable sequence and $M$ is a local martingale. The representation (15) can be rewritten in terms of $\mu$ and $\nu$, as equality (13).

Remark. We recall (see Chapter II, § 1b) that $\mathsf E(h_n \mid \mathcal F_{n-1})$ in (6) and (7) is the generalized conditional expectation defined as the difference

$$\mathsf E(h_n^+ \mid \mathcal F_{n-1}) - \mathsf E(h_n^- \mid \mathcal F_{n-1})$$

on the set $\{\omega\colon \mathsf E(|h_n| \mid \mathcal F_{n-1}) < \infty\}$ and in an arbitrary manner (e.g., as zero) on $\{\omega\colon \mathsf E(|h_n| \mid \mathcal F_{n-1}) = \infty\}$.

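In the general case where (14) can fail, one can obtain an analog of (15) or (13) as follows (we already discussed that in Chapter II, § 1b).

Let $\varphi = \varphi(x)$ be a bounded truncation function, i.e., a function with compact support that is equal to $x$ in a neighborhood of the origin. A typical example is the 'standard truncation function'

$$\varphi(x) = x\,I(|x| \le 1). \tag{16}$$

Then

$$\begin{aligned}
H_n = \sum_{k=1}^{n} h_k &= \sum_{k=1}^{n} \varphi(h_k) + \sum_{k=1}^{n} (h_k - \varphi(h_k)) \\
&= \sum_{k=1}^{n} \mathsf E[\varphi(h_k) \mid \mathcal F_{k-1}] + \sum_{k=1}^{n} \big[\varphi(h_k) - \mathsf E(\varphi(h_k) \mid \mathcal F_{k-1})\big] + \sum_{k=1}^{n} (h_k - \varphi(h_k)) \\
&= \sum_{k=1}^{n} \int \varphi(x)\,\nu_k(dx) + \sum_{k=1}^{n} \int \varphi(x)\,(\mu_k(dx) - \nu_k(dx)) + \sum_{k=1}^{n} \int (x - \varphi(x))\,\mu_k(dx).
\end{aligned} \tag{17}$$

Using the notation of (12) and (13), we obtain the following representation:

$$H_n = (\varphi(x) * \nu)_n + (\varphi(x) * (\mu - \nu))_n + ((x - \varphi(x)) * \mu)_n \tag{18}$$

or, in the coordinate-free notation,

$$H = \varphi * \nu + \varphi * (\mu - \nu) + (x - \varphi) * \mu. \tag{19}$$

DEFINITION. We call (18) and (19) the canonical representations of the sequence $H = (H_n)_{n \ge 0}$, $H_0 = 0$, with truncation function $\varphi = \varphi(x)$.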
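The three terms of the canonical representation (18) can be computed explicitly along a sample path. The sketch below assumes independent increments with a hypothetical four-point law, so that the compensators $\nu_k$ all coincide with that law, and uses the standard truncation function (16); the names `law`, `phi`, and `canonical_terms` are illustrative.

```python
from fractions import Fraction as F

# Hypothetical common law of the increments h_k (value -> probability).
law = {F(-3): F(1, 10), F(-1, 2): F(1, 5), F(1, 2): F(3, 5), F(4): F(1, 10)}

def phi(x):
    """Standard truncation function x * I(|x| <= 1) of formula (16)."""
    return x if abs(x) <= 1 else F(0)

def canonical_terms(increments):
    """Return ((phi*nu)_n, (phi*(mu-nu))_n, ((x-phi)*mu)_n) for one path."""
    drift_step = sum(p * phi(x) for x, p in law.items())  # integral of phi against nu_k
    drift, mart, big_jumps = F(0), F(0), F(0)
    for h in increments:
        drift += drift_step          # predictable 'drift' term
        mart += phi(h) - drift_step  # compensated small increments
        big_jumps += h - phi(h)      # increments removed by the truncation
    return drift, mart, big_jumps

path = [F(1, 2), F(4), F(-1, 2), F(-3)]
drift, mart, big_jumps = canonical_terms(path)
```

For this path the three terms add up to $H_4 = h_1 + \dots + h_4$, with the large increments $4$ and $-3$ collected in the last term only.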


With an eye to the continuous-time case it is useful to compare this definition and the canonical representation of semimartingales in [250; Chapter II, § 2c] and in Chapter VII, § 3a of the present book.

5. Let $H = (H_n)_{n \ge 1}$ be a sequence with generalized Doob decomposition

$$H_n = A_n + M_n$$

and assume that condition (10) in § 3d is satisfied. Then by Theorem 2 in the same § 3d we obtain the following representation with respect to an arbitrary measure $\tilde{\mathsf P} \overset{\mathrm{loc}}{\ll} \mathsf P$:

$$H_n = \Big[A_n + \sum_{k=1}^{n} \mathsf E(\alpha_k \Delta M_k \mid \mathcal F_{k-1})\Big] + \Big[M_n - \sum_{k=1}^{n} \mathsf E(\alpha_k \Delta M_k \mid \mathcal F_{k-1})\Big] = \tilde A_n + \tilde M_n, \tag{20}$$

where $\tilde M \in \mathcal M_{\mathrm{loc}}(\tilde{\mathsf P})$.
We now write down the canonical representations of $H$ with respect to the measures $\mathsf P$ and $\tilde{\mathsf P}$:

$$H = \varphi * \nu + \varphi * (\mu - \nu) + (x - \varphi) * \mu \quad (\text{with respect to } \mathsf P) \tag{21}$$

and

$$H = \varphi * \tilde\nu + \varphi * (\mu - \tilde\nu) + (x - \varphi) * \mu \quad (\text{with respect to } \tilde{\mathsf P}), \tag{22}$$

where $\mu$ is the jump measure of the sequence $H$.

It is important for the stochastic calculus based on the canonical representations (21) and (22) that one knows how to calculate the compensators $\tilde\nu$ for given compensators $\nu$ and the characteristics of the density process $Z = (Z_n)$. In particular, we are interested in formulas describing the transformation of the 'drift' terms $\varphi * \nu$ and $\varphi * \tilde\nu$ under a change of measure.


We shall discuss this issue more closely under the assumption that P Z P. Let 
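$$\nu_n(\cdot\,; \omega) = \mathsf P(h_n \in \cdot \mid \mathcal F_{n-1})(\omega)$$

and

$$\tilde\nu_n(\cdot\,; \omega) = \tilde{\mathsf P}(h_n \in \cdot \mid \mathcal F_{n-1})(\omega)$$

be the regular modifications of the corresponding conditional probabilities.

Bayes's formula (4) in § 3a assumes the following form for $Y = I_A(h_n)$, $A \in \mathcal B(\mathbb R \setminus \{0\})$, and $m = n - 1$:

$$\tilde{\mathsf E}(I_A(h_n) \mid \mathcal F_{n-1}) = \mathsf E\Big(I_A(h_n)\,\frac{Z_n}{Z_{n-1}} \,\Big|\, \mathcal F_{n-1}\Big). \tag{23}$$

Hence the following conjecture looks plausible: for each $\omega \in \Omega$ the conditional distributions $\tilde\nu_n(\cdot\,; \omega)$ are absolutely continuous with respect to $\nu_n(\cdot\,; \omega)$, i.e., there exists a $\mathcal B(\mathbb R \setminus \{0\})$-measurable (for each $\omega \in \Omega$) function $Y_n = Y_n(x, \omega)$ such that

$$\tilde\nu_n(A; \omega) = \int_A Y_n(x, \omega)\,\nu_n(dx; \omega). \tag{24}$$

If it were so, then

$$\frac{d\tilde\nu_n}{d\nu_n}(x, \omega) = Y_n(x, \omega), \tag{25}$$

i.e., the function $Y_n(x, \omega)$ would play the role of the density of one measure (more precisely, of one conditional distribution) with respect to the other.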


We present now the proof of formula (24) (under the assumption P p P), which 
simultaneously delivers an ‘explicit formula’ for the density Yn = Yn(z,w), n > 1. 

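We consider the conditional expectation on the right-hand side of (23). By definition, for each $B \in \mathcal F_{n-1}$ we have

$$\begin{aligned}
\int_B \mathsf E\Big(I_A(h_n)\,\frac{Z_n}{Z_{n-1}} \,\Big|\, \mathcal F_{n-1}\Big)(\omega)\,(\mathsf P \mid \mathcal F_{n-1})(d\omega)
&= \int_B I_A(h_n(\omega))\,\frac{Z_n(\omega)}{Z_{n-1}(\omega)}\,(\mathsf P \mid \mathcal F_n)(d\omega) \\
&= \int_B \Big[\int_A \frac{Z_n(\omega)}{Z_{n-1}(\omega)}\,\mu_n(dx; \omega)\Big](\mathsf P \mid \mathcal F_{n-1})(d\omega) \\
&= \int_{A \times B} \frac{Z_n(\omega)}{Z_{n-1}(\omega)}\,\mu_n(dx; \omega)\,(\mathsf P \mid \mathcal F_{n-1})(d\omega).
\end{aligned} \tag{26}$$

Let $M_n(dx, d\omega)$ be the 'skew product of measures'

$$\mu_n(dx; \omega)\,(\mathsf P \mid \mathcal F_{n-1})(d\omega)$$

on $\mathcal B(\mathbb R \setminus \{0\}) \otimes \mathcal F_{n-1}$ and let $\mathsf E_{M_n}(\,\cdot \mid \mathcal B(\mathbb R \setminus \{0\}) \otimes \mathcal F_{n-1})$ be the conditional expectation (with respect to the $\sigma$-algebra $\mathcal B(\mathbb R \setminus \{0\}) \otimes \mathcal F_{n-1}$ and the measure $M_n = M_n(dx, d\omega)$) defined in a standard way (see, e.g., [439; Chapter II, § 7]), on the basis of the Radon-Nikodym theorem.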
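Then we derive from (26) by Fubini's theorem that

$$\begin{aligned}
\int_B \mathsf E\Big(I_A(h_n)\,\frac{Z_n}{Z_{n-1}} \,\Big|\, \mathcal F_{n-1}\Big)(\omega)\,(\mathsf P \mid \mathcal F_{n-1})(d\omega)
&= \int_{A \times B} \frac{Z_n(\omega)}{Z_{n-1}(\omega)}\,M_n(dx, d\omega) \\
&= \int_{A \times B} \mathsf E_{M_n}\Big(\frac{Z_n}{Z_{n-1}} \,\Big|\, \mathcal B(\mathbb R \setminus \{0\}) \otimes \mathcal F_{n-1}\Big)(x, \omega)\,M_n(dx, d\omega) \\
&= \int_B \Big[\int_A \mathsf E_{M_n}\Big(\frac{Z_n}{Z_{n-1}} \,\Big|\, \mathcal B(\mathbb R \setminus \{0\}) \otimes \mathcal F_{n-1}\Big)(x, \omega)\,\nu_n(dx; \omega)\Big](\mathsf P \mid \mathcal F_{n-1})(d\omega).
\end{aligned}$$

Since $B$ is an arbitrary subset in $\mathcal F_{n-1}$, it follows that ($\mathsf P$-a.s.)

$$\mathsf E\Big(I_A(h_n)\,\frac{Z_n}{Z_{n-1}} \,\Big|\, \mathcal F_{n-1}\Big)(\omega) = \int_A Y_n(x, \omega)\,\nu_n(dx; \omega), \tag{27}$$

where

$$Y_n(x, \omega) = \mathsf E_{M_n}\Big(\frac{Z_n}{Z_{n-1}} \,\Big|\, \mathcal B(\mathbb R \setminus \{0\}) \otimes \mathcal F_{n-1}\Big)(x, \omega). \tag{28}$$

Comparing (23) with $\tilde{\mathsf E}(I_A(h_n) \mid \mathcal F_{n-1}) = \int_A \tilde\nu_n(dx; \omega)$ and formulas (27) and (28) we see that $\tilde\nu_n(\cdot\,; \omega) \ll \nu_n(\cdot\,; \omega)$ for each $\omega \in \Omega$ and (25) holds.

This formula provides an answer to the above question: the 'drift' terms $\varphi * \tilde\nu$ and $\varphi * \nu$ in (22) and (21) are connected by the relation

$$\varphi * \tilde\nu = \varphi * \nu + \varphi\,(Y - 1) * \nu \tag{29}$$

(at any rate, if $(|\varphi(x)(Y - 1)| * \nu)_n < \infty$, $n \ge 1$).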


§ 3f. ‘Predictable’ Criteria of Arbitrage-Free (B,S)-Markets 


1. By the First fundamental asset pricing theorem (§ 2b) a $(B, S)$-market formed by a bank account $B = (B_n)$ and $d$ assets $S = (S^1, \dots, S^d)$, $S^i = (S^i_n)$, $0 \le n \le N$, is arbitrage-free if and only if there exists a probability (martingale) measure $\tilde{\mathsf P}$ equivalent to $\mathsf P$ on the initial filtered probability space $(\Omega, \mathcal F, (\mathcal F_n)_{n \le N}, \mathsf P)$ such that the $d$-dimensional sequence of discounted prices

$$\frac{S}{B} = \Big(\frac{S_n}{B_n}\Big)_{0 \le n \le N}$$

is a $\tilde{\mathsf P}$-martingale.

The description of the class $\mathscr P(\mathsf P)$ of all such martingale measures $\tilde{\mathsf P} \sim \mathsf P$ is also very interesting because we have already seen in § 1c that in looking for upper and lower prices one must consider the greatest and the least values over the class $\mathscr P(\mathsf P)$.

In searching for such measures it is reasonable to start with a slightly more general problem of the construction of martingale measures $\tilde{\mathsf P}$ that are locally absolutely continuous with respect to $\mathsf P$, leaving aside the question whether a measure so obtained satisfies incidentally the relation $\tilde{\mathsf P} \sim \mathsf P$.


2. In the previous sections we exposed all the material pertaining to 'absolutely continuous changes of measure' that is necessary for a discussion of the problem of the construction of measures $\tilde{\mathsf P} \overset{\mathrm{loc}}{\ll} \mathsf P$.

We start with the case where $d = 1$, $B_n \equiv 1$, $S = (S_n)$ is the only 'risk' asset, and

$$S_n = S_0\,e^{H_n}. \tag{1}$$

We assume that we have a filtered probability space $(\Omega, \mathcal F, (\mathcal F_n), \mathsf P)$ and that the variables $H_n$ are $\mathcal F_n$-measurable. As before, we shall set $h_n = \Delta H_n$ and $H_0 = 0$. Throughout, we shall assume that $\mathcal F_0 = \{\varnothing, \Omega\}$.

Using the notation and the results of § 3c, for $n \ge 1$ we set

$$\hat H_n = \sum_{k=1}^{n} (e^{\Delta H_k} - 1) \tag{2}$$

and

$$\mathcal E(\hat H)_n = \prod_{k=1}^{n} (1 + \Delta\hat H_k); \tag{3}$$

we also set $\hat H_0 = 0$.

Note that

$$\Delta\mathcal E(\hat H)_n = \mathcal E(\hat H)_{n-1}\,\Delta\hat H_n. \tag{4}$$

In our construction of a martingale measure $\tilde{\mathsf P} \overset{\mathrm{loc}}{\ll} \mathsf P$ such that $S = (S_n)$ is a $\tilde{\mathsf P}$-martingale we can follow directly the above-described pattern: write down the canonical representation for $S = (S_n)$ (of type (13) in § 3e) and then find a measure $\tilde{\mathsf P} \overset{\mathrm{loc}}{\ll} \mathsf P$ 'killing' the drift term. We could proceed in that way; however, we have an additional property: prices are positive. This property enables us to consider the logarithms of the prices (i.e., the sequence $H = (H_n)$), which, as statistical analysis shows, have a simpler structure than the sequence $S = (S_n)$ of prices themselves.


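Let $\tilde{\mathsf P}$ be a measure such that $\tilde{\mathsf P} \overset{\mathrm{loc}}{\ll} \mathsf P$. We set $Z_n = \dfrac{d\tilde{\mathsf P}_n}{d\mathsf P_n}$, where $\mathsf P_n = \mathsf P \mid \mathcal F_n$ and $\tilde{\mathsf P}_n = \tilde{\mathsf P} \mid \mathcal F_n$. In what follows we set $\tilde{\mathsf P}_0 = \mathsf P_0$, so that $Z_0 = 1$. Let $X = (X_n)_{n \ge 0}$ be some sequence of $\mathcal F_n$-measurable variables $X_n$, and let $X_0 = 0$.

By the lemma in § 3d,

if $\tilde{\mathsf P} \overset{\mathrm{loc}}{\ll} \mathsf P$, then

$$X \in \mathcal M(\tilde{\mathsf P}) \iff XZ \in \mathcal M(\mathsf P), \tag{5}$$

and

if $\tilde{\mathsf P} \overset{\mathrm{loc}}{\sim} \mathsf P$, then

$$X \in \mathcal M_{\mathrm{loc}}(\tilde{\mathsf P}) \iff XZ \in \mathcal M_{\mathrm{loc}}(\mathsf P). \tag{6}$$

Now let $X \in \mathcal M_{\mathrm{loc}}(\mathsf P)$. By the theorem in Chapter II, § 1c the sequence $X$ is a martingale transform, and therefore $\Delta X_n = a_n \Delta M_n$, where the $a_n$ are $\mathcal F_{n-1}$-measurable and $M$ is a martingale (with respect to $\mathsf P$). Hence

$$\Delta\mathcal E(X)_n = \mathcal E(X)_{n-1}\,\Delta X_n = a_n\,\mathcal E(X)_{n-1}\,\Delta M_n,$$

and therefore $\mathcal E(X)$ is also a martingale transform, so that (again, by the theorem in Chapter II, § 1c) $\mathcal E(X) \in \mathcal M_{\mathrm{loc}}(\mathsf P)$. Thus,

$$X \in \mathcal M_{\mathrm{loc}}(\mathsf P) \Longrightarrow \mathcal E(X) \in \mathcal M_{\mathrm{loc}}(\mathsf P). \tag{7}$$

Assuming that $\mathcal E(X) \ne 0$ and considering

$$\Delta X_n = \Delta\mathcal E(X)_n / \mathcal E(X)_{n-1},$$

we can see in a similar way that

$$\mathcal E(X) \in \mathcal M_{\mathrm{loc}}(\mathsf P) \Longrightarrow X \in \mathcal M_{\mathrm{loc}}(\mathsf P).$$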


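Hence, in view of (6), we obtain the following result.

LEMMA 1. Let $\tilde{\mathsf P} \overset{\mathrm{loc}}{\sim} \mathsf P$ and assume that $\mathcal E(X) \ne 0$ ($\mathsf P$-a.s.). Then

$$\mathcal E(X) \in \mathcal M_{\mathrm{loc}}(\tilde{\mathsf P}) \iff X \in \mathcal M_{\mathrm{loc}}(\tilde{\mathsf P}) \iff XZ \in \mathcal M_{\mathrm{loc}}(\mathsf P).$$

Assume now that $\mathcal E(X) \ge 0$ and $\mathcal E(X) \in \mathcal M_{\mathrm{loc}}(\tilde{\mathsf P})$. Then, by the lemma in Chapter II, § 1c we obtain that $\mathcal E(X) \in \mathcal M(\tilde{\mathsf P})$. Consequently, in view of (5), we have the following result.

LEMMA 2. Let $\tilde{\mathsf P} \overset{\mathrm{loc}}{\ll} \mathsf P$ and assume that $\mathcal E(X) \ge 0$ ($\mathsf P$-a.s.). Then

$$\mathcal E(X) \in \mathcal M_{\mathrm{loc}}(\tilde{\mathsf P}) \iff \mathcal E(X) \in \mathcal M(\tilde{\mathsf P}) \iff \mathcal E(X) Z \in \mathcal M(\mathsf P).$$

We point out the following implications (which must be interpreted component-wise) holding by (3):

$$\Delta X \ne -1 \iff \mathcal E(X) \ne 0,$$
$$\Delta X \ge -1 \iff \mathcal E(X) \ge 0,$$

and

$$\Delta X > -1 \iff \mathcal E(X) > 0.$$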

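Applying Lemmas 1 and 2 to the case of $X = \hat H$, where $\hat H$ is related to $H$ by formula (2), we obtain the following result.

LEMMA 3. Let $S = (S_n)_{n \ge 0}$, where

$$S_n = S_0\,e^{H_n}, \qquad H_0 = 0, \tag{8}$$

and let $\Delta\hat H_n = e^{\Delta H_n} - 1$, $\hat H_0 = 0$.

Then

$$S_n = S_0\,\mathcal E(\hat H)_n \tag{9}$$

and

if $\tilde{\mathsf P} \overset{\mathrm{loc}}{\ll} \mathsf P$, then

$$S \in \mathcal M(\tilde{\mathsf P}) \iff \mathcal E(\hat H) Z \in \mathcal M(\mathsf P); \tag{10}$$

if $\tilde{\mathsf P} \overset{\mathrm{loc}}{\sim} \mathsf P$, then

$$S \in \mathcal M_{\mathrm{loc}}(\tilde{\mathsf P}) \iff \hat H Z \in \mathcal M_{\mathrm{loc}}(\mathsf P). \tag{11}$$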

3. These implications indicate the way in which one can seek measures $\tilde{\mathsf P}$ such that $S \in \mathcal M(\tilde{\mathsf P})$.

The sequence $Z = (Z_n)$ is a $\mathsf P$-martingale. In accordance with (10) and (11), we must describe the non-negative $\mathsf P$-martingales $Z$ such that $\mathsf E Z_n = 1$ and, in addition, either $\mathcal E(\hat H) Z \in \mathcal M(\mathsf P)$ if we are looking for a measure $\tilde{\mathsf P} \overset{\mathrm{loc}}{\ll} \mathsf P$ or $\hat H Z \in \mathcal M_{\mathrm{loc}}(\mathsf P)$ if we require that $\tilde{\mathsf P} \overset{\mathrm{loc}}{\sim} \mathsf P$.

The corresponding class of martingales $Z = (Z_n)$ in the conditionally Gaussian case is formed by the martingales

$$Z_n = \exp\Big\{\sum_{k=1}^{n} b_k \varepsilon_k - \frac12 \sum_{k=1}^{n} b_k^2\Big\} \tag{12}$$

(with $\mathcal F_{k-1}$-measurable $b_k$; see, for instance, formula (7) in § 3b) satisfying the difference equations

$$\Delta Z_n = Z_{n-1}\,\Delta N_n, \tag{13}$$

where the variables

$$\Delta N_n = e^{b_n \varepsilon_n - \frac12 b_n^2} - 1 \tag{14}$$

make up a generalized martingale difference, i.e.,

$$\mathsf E(|\Delta N_n| \mid \mathcal F_{n-1}) < \infty \quad\text{and}\quad \mathsf E(\Delta N_n \mid \mathcal F_{n-1}) = 0.$$

Hence a natural way to look for the density processes $Z = (Z_n)$ can be as follows.

We shall seek the required densities $Z_n$ (necessary for the construction of the measures $\tilde{\mathsf P}_n$ by $\tilde{\mathsf P}_n(d\omega) = Z_n(\omega)\,\mathsf P_n(d\omega)$) in the form (13), i.e., we assume that

$$Z_n = \mathcal E(N)_n, \qquad Z_0 = 1, \tag{15}$$

where $N$ are some (specified in what follows) local martingales with $N_0 = 0$, $\Delta N_n \ge -1$, and $\mathsf E\,\mathcal E(N)_n = 1$.

The question on the volume of the class of so obtained measures P pa P is far 
from simple. The point is that, first, it is not an easy problem to decide whether (for a family of consistent 'finite-dimensional' distributions $\{\tilde{\mathsf P}_n\}$) there exists a measure $\tilde{\mathsf P}$ such that $\tilde{\mathsf P} \mid \mathcal F_n = \tilde{\mathsf P}_n$, $n \ge 1$. (See a counterexample in [439; Chapter I, § 3].)

Second, the question of the structure of all the martingales (or local martingales) on $(\Omega, \mathcal F, (\mathcal F_n)_{n \ge 1}, \mathsf P)$ is not simple in principle. (See [250; Chapter III] on this issue.)

In what follows, we take a way that, while giving us no exhaustive answer to the 
question of the structure of all measures P such that P s PorP lot P, is nevertheless 
fairly simple technically and brings one to a broad class of such measures. First of 
all, we make several observations.

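It is clear from the above that, given a measure $\mathsf P$, all measures $\tilde{\mathsf P}$ such that $\tilde{\mathsf P} \overset{\mathrm{loc}}{\sim} \mathsf P$ are completely (as regards their finite-dimensional distributions $\{\tilde{\mathsf P}_n\}$) described by their densities $Z = (Z_n)$. It follows from the assumption $\tilde{\mathsf P} \overset{\mathrm{loc}}{\sim} \mathsf P$ that $\mathsf P(Z_n > 0) = 1$, so that we can construct from $Z = (Z_n)$ a new sequence $N = (N_n)$ with $N_0 = 0$ and

$$\Delta N_n = \frac{\Delta Z_n}{Z_{n-1}}. \tag{16}$$

Clearly, $N \in \mathcal M_{\mathrm{loc}}(\mathsf P)$, and the sequences $Z$ and $N$ are in a one-to-one correspondence thanks to the equality $Z_n = \mathcal E(N)_n$.

Hence we can turn in our construction of measures $\tilde{\mathsf P} \overset{\mathrm{loc}}{\sim} \mathsf P$ from the densities $Z = (Z_n)$, where $Z_n = \dfrac{d\tilde{\mathsf P}_n}{d\mathsf P_n}$, to the appropriate sequence $N = (N_n)$, which must satisfy the inequalities $\Delta N_n > -1$ to ensure that $Z_n > 0$ ($\mathsf P$-a.s.) for $n \ge 1$.

Thus, we shall assume that $S_0 > 0$, $S_n = S_0\,\mathcal E(\hat H)_n$ for $n \ge 1$, and, in addition, $S_n > 0$ (which is equivalent to the relation $\Delta\hat H_n > -1$).

Also let $Z = (Z_n)$, where $Z_n = \mathcal E(N)_n$ with $\Delta N_n > -1$, so that $Z_n > 0$ ($\mathsf P$-a.s.).

Assume that there exists a measure $\tilde{\mathsf P}$ such that its restrictions $\tilde{\mathsf P}_n = \tilde{\mathsf P} \mid \mathcal F_n$ satisfy the relations $\tilde{\mathsf P}_n \sim \mathsf P_n$, $n \ge 1$, i.e., $\tilde{\mathsf P} \overset{\mathrm{loc}}{\sim} \mathsf P$.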


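Then, in view of (10),

$$S \in \mathcal M(\tilde{\mathsf P}) \iff \mathcal E(\hat H)\,\mathcal E(N) \in \mathcal M(\mathsf P). \tag{17}$$

4. The following Yor's formula (M. Yor; see, e.g., [402]) can be immediately verified:

$$\mathcal E(\hat H)\,\mathcal E(N) = \mathcal E(\hat H + N + [\hat H, N]), \tag{18}$$

where

$$[\hat H, N]_n = \sum_{k=1}^{n} \Delta\hat H_k\,\Delta N_k.$$

By the assumptions $\mathcal E(\hat H) > 0$, $\mathcal E(N) > 0$, and equivalence (17) we obtain

$$S \in \mathcal M(\tilde{\mathsf P}) \iff \mathcal E(\hat H + N + [\hat H, N]) \in \mathcal M(\mathsf P) \iff \hat H + N + [\hat H, N] \in \mathcal M_{\mathrm{loc}}(\mathsf P).$$

Hence if $N \in \mathcal M_{\mathrm{loc}}(\mathsf P)$, $\Delta N > -1$, $Z_n = \mathcal E(N)_n$, and $d\tilde{\mathsf P}_n = Z_n\,d\mathsf P_n$, then

$$S \in \mathcal M(\tilde{\mathsf P}) \iff \hat H + [\hat H, N] \in \mathcal M_{\mathrm{loc}}(\mathsf P).$$

Since $\Delta(\hat H + [\hat H, N]) = \Delta\hat H\,(1 + \Delta N)$, the inclusion $\hat H + [\hat H, N] \in \mathcal M_{\mathrm{loc}}(\mathsf P)$ is equivalent to the condition that the sequence $\Delta\hat H\,(1 + \Delta N) = (\Delta\hat H_n (1 + \Delta N_n))_{n \ge 1}$ is a local $\mathsf P$-martingale difference, or, which is the same (see the lemma in Chapter II, § 1c), that it is a generalized $\mathsf P$-martingale difference, i.e., satisfies ($\mathsf P$-a.s.) the relations

$$\mathsf E\big[|\Delta\hat H_n (1 + \Delta N_n)| \,\big|\, \mathcal F_{n-1}\big] < \infty \tag{19}$$

and

$$\mathsf E\big[\Delta\hat H_n (1 + \Delta N_n) \,\big|\, \mathcal F_{n-1}\big] = 0 \tag{20}$$

for all $n \ge 1$.

We note that conditions (19) and (20) are formulated in terms of the conditional expectation $\mathsf E(\,\cdot \mid \mathcal F_{n-1})$ (i.e., in 'predictable' terms).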
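In discrete time Yor's formula (18) reduces to the identity $(1 + h)(1 + n) = 1 + h + n + hn$ applied factor by factor. A minimal check in exact arithmetic, with arbitrary illustrative increments (the lists `dH` and `dN` are assumptions chosen only to satisfy $\Delta\hat H > -1$ and $\Delta N > -1$):

```python
from fractions import Fraction as F

def stoch_exp(increments):
    """Stochastic exponential: running products of (1 + increment), starting from 1."""
    out = [F(1)]
    for d in increments:
        out.append(out[-1] * (1 + d))
    return out

dH = [F(1, 10), F(-1, 5), F(3, 10)]   # illustrative increments of H-hat
dN = [F(-1, 2), F(1, 4), F(1, 5)]     # illustrative increments of N

# Left-hand side of (18): the product of the two stochastic exponentials.
lhs = [x * y for x, y in zip(stoch_exp(dH), stoch_exp(dN))]
# Right-hand side of (18): the exponential of H-hat + N + [H-hat, N].
rhs = stoch_exp([h + n + h * n for h, n in zip(dH, dN)])
```

The two lists `lhs` and `rhs` coincide term by term.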

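Conditions (19) and (20) can be expressed in various forms. For instance, bearing in mind that $\Delta Z_n = Z_{n-1}\,\Delta N_n$, $\Delta N_n > -1$, and $\Delta\hat H_n = e^{\Delta H_n} - 1$, we see that (20) is equivalent to the following condition on $Z$:

$$\mathsf E\Big[e^{\Delta H_n}\,\frac{Z_n}{Z_{n-1}} \,\Big|\, \mathcal F_{n-1}\Big] = 1. \tag{21}$$

Of course, we could derive this condition directly because if $\tilde{\mathsf P} \overset{\mathrm{loc}}{\ll} \mathsf P$, then

$$\tilde{\mathsf E}\big[e^{\Delta H_n} \,\big|\, \mathcal F_{n-1}\big] = \mathsf E\Big[e^{\Delta H_n}\,\frac{Z_n}{Z_{n-1}} \,\Big|\, \mathcal F_{n-1}\Big] \quad (\tilde{\mathsf P}\text{-a.s.}) \tag{22}$$

by Bayes's formula ((4) in § 3a or (1) in § 3d), and therefore (20) is equivalent to the relation $\tilde{\mathsf E}(S_n \mid \mathcal F_{n-1}) = S_{n-1}$ ($\mathsf P$- and $\tilde{\mathsf P}$-a.s.). In a similar way, (19) $\iff$ $\tilde{\mathsf E}(S_n \mid \mathcal F_{n-1}) < \infty$. Since the $S_n$ are non-negative, conditions (19) and (20) ensure that the sequence $S = (S_n)$ is a martingale with respect to the measure $\tilde{\mathsf P}$ constructed from the sequence $N = (N_n)$.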
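Conditions (20) and (21) are easy to test in a one-step binomial model. In the sketch below the gross returns `u`, `d`, the probability `p`, and the candidate probability `q` are hypothetical; $\Delta N_n$ is taken as the likelihood ratio of the two one-step laws minus one.

```python
from fractions import Fraction as F

u, d, p = F(6, 5), F(9, 10), F(1, 2)   # values of e^{Delta H_n} and P(up move)
q = (1 - d) / (u - d)                  # candidate up-probability under the new measure

# Triples (e^{Delta H_n}, P-probability, Delta N_n) for the up and the down move.
outcomes = [(u, p, q / p - 1), (d, 1 - p, (1 - q) / (1 - p) - 1)]

mean_dN = sum(pr * dn for _, pr, dn in outcomes)                  # E(Delta N_n | F_{n-1})
cond_20 = sum(pr * (g - 1) * (1 + dn) for g, pr, dn in outcomes)  # left-hand side of (20)
cond_21 = sum(pr * g * (1 + dn) for g, pr, dn in outcomes)        # left-hand side of (21)
```

With these numbers `mean_dN` and `cond_20` are zero and `cond_21` equals one, and both values of $\Delta N_n$ exceed $-1$, so the measure built from $N$ is equivalent to the original one on this step.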

The above method can easily be extended to the multidimensional case, when 
S =(S!,...,S%) and we consider the question of whether 


ž € M(P) 


with respect to some measure P oe P. 
This is the question we discuss next. 


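5. Our choice of the bank account $B = (B_n)$ as the normalizing factor has the advantage that the variables $B_n$ are $\mathcal F_{n-1}$-measurable, which, as already mentioned, brings certain technical simplifications. However, there are no serious obstacles to considering any other asset in this role.

In this connection we discuss now the following situation.

Let $S^0 = (S^0_n)$ and $S^1 = (S^1_n)$ be two assets. We assume that

$$S^i_n = S^i_0\,e^{H^i_n}, \qquad i = 0, 1,$$

where $H^i_0 = 0$, and let $\Delta Z_n = Z_{n-1}\,\Delta N_n$ and $\Delta\hat H^i_n = e^{\Delta H^i_n} - 1$.

Setting $S^1_0$ and $S^0_0$ to be constants we consider the ratio

$$\frac{S^1}{S^0} = \Big(\frac{S^1_n}{S^0_n}\Big)_{n \ge 0}$$

and ask whether $\dfrac{S^1}{S^0} \in \mathcal M(\tilde{\mathsf P})$. (To avoid the question of the existence of a measure $\tilde{\mathsf P}$ such that $\tilde{\mathsf P} \mid \mathcal F_n = \tilde{\mathsf P}_n$, where $\tilde{\mathsf P}_n(d\omega) = Z_n(\omega)\,\mathsf P_n(d\omega)$, we can assume that $n \le N$ and $\mathcal F = \mathcal F_N$.)

Clearly

$$\frac{S^1}{S^0}\,Z = \frac{S^1_0}{S^0_0} \cdot \mathcal E(\hat H^1) \cdot \mathcal E^{-1}(\hat H^0) \cdot \mathcal E(N) \tag{23}$$

(we use the coordinate-free notation). It is easy to verify directly that

$$\mathcal E^{-1}(\hat H^0) = \mathcal E(-\hat H^*), \tag{24}$$

where

$$\hat H^*_n = \hat H^0_n - \sum_{k=1}^{n} \frac{(\Delta\hat H^0_k)^2}{1 + \Delta\hat H^0_k}, \tag{25}$$

so that $\Delta\hat H^*_k = \Delta\hat H^0_k / (1 + \Delta\hat H^0_k)$.

Hence $\dfrac{S^1}{S^0}\,Z$ is a product of three stochastic exponentials:

$$\frac{S^1}{S^0}\,Z = \frac{S^1_0}{S^0_0} \cdot \mathcal E(\hat H^1) \cdot \mathcal E(-\hat H^*) \cdot \mathcal E(N), \tag{26}$$

so that using Yor's formula (18) we successively obtain

$$\begin{aligned}
\mathcal E(\hat H^1)\,\mathcal E(-\hat H^*)\,\mathcal E(N)
&= \mathcal E\big(\hat H^1 - \hat H^* - [\hat H^1, \hat H^*]\big)\,\mathcal E(N) \\
&= \mathcal E\Big(\hat H^1 - \hat H^0 + N + \sum_{k \le \cdot} \frac{(\Delta\hat H^0_k - \Delta\hat H^1_k)(\Delta\hat H^0_k - \Delta N_k)}{1 + \Delta\hat H^0_k}\Big).
\end{aligned} \tag{27}$$

Hence one necessary and sufficient condition on $N$ ensuring that $\dfrac{S^1}{S^0} \in \mathcal M(\tilde{\mathsf P}_N)$ is the following inclusion:

$$\hat H^1 - \hat H^0 + \sum_{k \le \cdot} \frac{(\Delta\hat H^0_k - \Delta\hat H^1_k)(\Delta\hat H^0_k - \Delta N_k)}{1 + \Delta\hat H^0_k} \in \mathcal M_{\mathrm{loc}}(\mathsf P), \tag{28}$$

or, equivalently,

$$\Big(\sum_{k \le n} \frac{(\Delta\hat H^1_k - \Delta\hat H^0_k)(1 + \Delta N_k)}{1 + \Delta\hat H^0_k}\Big)_{n \ge 1} \in \mathcal M_{\mathrm{loc}}(\mathsf P). \tag{29}$$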
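The algebra behind the inversion formula for a stochastic exponential and the passage to condition (29) can be confirmed on arbitrary increments. The following sketch, in exact arithmetic, uses illustrative lists `dH1`, `dH0`, and `dN` (all increments larger than $-1$); these numbers are assumptions, not data from the text.

```python
from fractions import Fraction as F

def stoch_exp(increments):
    """Stochastic exponential: running products of (1 + increment), starting from 1."""
    out = [F(1)]
    for d in increments:
        out.append(out[-1] * (1 + d))
    return out

dH1 = [F(1, 10), F(-1, 5), F(3, 10)]    # increments of H-hat^1
dH0 = [F(1, 20), F(1, 20), F(-1, 10)]   # increments of H-hat^0
dN = [F(-1, 2), F(1, 4), F(1, 5)]       # increments of N

# Increments of H-hat^*: b - b^2 / (1 + b), which equals b / (1 + b).
dHstar = [b - b * b / (1 + b) for b in dH0]

# Inversion: 1 / E(H-hat^0) should equal E(-H-hat^*).
inverse = [1 / e for e in stoch_exp(dH0)]
exp_minus_star = stoch_exp([-s for s in dHstar])

# Product of the three exponentials versus a single exponential whose increments
# are Delta N plus the summand appearing in condition (29).
triple = [x * y * z for x, y, z in zip(stoch_exp(dH1), exp_minus_star, stoch_exp(dN))]
single = stoch_exp([c + (a - b) * (1 + c) / (1 + b) for a, b, c in zip(dH1, dH0, dN)])
```

Here `inverse == exp_minus_star` and `triple == single`; since $N$ itself is a local martingale, only the second summand of each increment matters for the martingale property, which is exactly condition (29).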

Since for $d$ assets $S^1, \ldots, S^d$ the question on the martingale property of the vector of discounted prices

$$\frac{S}{S^0} = \left(\frac{S^1}{S^0}, \ldots, \frac{S^d}{S^0}\right)$$

can be answered by component-wise analysis, we obtain from (29) the following general result.
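THEOREM 1. Let $(S^0, S^1, \ldots, S^d)$ be $d+1$ assets defined on the filtered probability space $(\Omega, \mathcal F, (\mathcal F_n)_{n \le N}, P)$ with $\mathcal F = \mathcal F_N$ such that

$$S^i_n = S^i_0 e^{H^i_n}, \quad \text{where } H^i_0 = 0, \quad i = 0, 1, \ldots, d, \quad 1 \le n \le N,$$

or, equivalently,

$$\Delta S^i_n = S^i_{n-1}\,\Delta\widehat H^i_n, \tag{30}$$

where

$$\Delta\widehat H^i_n = e^{\Delta H^i_n} - 1, \qquad \widehat H^i_0 = 0. \tag{31}$$

Assume also that the constants $S^i_0$ are positive for all $i = 0, 1, \ldots, d$.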




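Further, let $Z = (Z_n)_{0 \le n \le N}$ be a sequence of random variables such that

$$\Delta Z_n = Z_{n-1}\,\Delta N_n, \qquad Z_0 = 1, \tag{32}$$

where $\Delta N_n > -1$.
Then the ratio $\frac{S}{S^0}$ is a $d$-dimensional martingale with respect to a measure $\widetilde P_N$ such that

$$\widetilde P_N(d\omega) = Z_N(\omega)\, P(d\omega) \tag{33}$$

if and only if for all $i$ and $n$, $i = 1, \ldots, d$ and $1 \le n \le N$, we have ($P$-a.s.)

$$E\left[\,\left|\frac{(\Delta\widehat H^i_n - \Delta\widehat H^0_n)(1 + \Delta N_n)}{1 + \Delta\widehat H^0_n}\right| \;\Big|\; \mathcal F_{n-1}\right] < \infty \tag{34}$$

and

$$E\left[\frac{(\Delta\widehat H^i_n - \Delta\widehat H^0_n)(1 + \Delta N_n)}{1 + \Delta\widehat H^0_n} \;\Big|\; \mathcal F_{n-1}\right] = 0. \tag{35}$$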


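COROLLARY. Let $S^0 = (S^0_n)$ be a ‘risk-free’ asset in the sense of the $\mathcal F_{n-1}$-measurability of the $S^0_n$ (for example, $S^0 = B$ can be a bank account with fixed interest rate ($\Delta\widehat H^0_n = r$)). Then (in view of the $\mathcal F_{n-1}$-measurability of the $\Delta\widehat H^0_n$) conditions (34) and (35) can be written in the following form:

$$E\bigl[\,|\Delta\widehat H^i_n (1 + \Delta N_n)| \;\big|\; \mathcal F_{n-1}\bigr] < \infty \tag{36}$$

and

$$E\bigl[\Delta\widehat H^i_n (1 + \Delta N_n) \;\big|\; \mathcal F_{n-1}\bigr] = \Delta\widehat H^0_n \tag{37}$$

for $i = 1, \ldots, d$ and $n$ between 1 and $N$.
In particular, if $\Delta\widehat H^0_n = 0$, then these conditions have the following form:

$$E\bigl[\,|\Delta\widehat H^i_n (1 + \Delta N_n)| \;\big|\; \mathcal F_{n-1}\bigr] < \infty, \tag{38}$$

$$E\bigl[\Delta\widehat H^i_n (1 + \Delta N_n) \;\big|\; \mathcal F_{n-1}\bigr] = 0, \tag{39}$$

i.e., are the same as earlier obtained conditions (19) and (20).
If, in addition, $\Delta N_n = 0$, then the conditions in question reduce to

$$E\bigl[\,|\Delta\widehat H^i_n| \;\big|\; \mathcal F_{n-1}\bigr] < \infty \tag{40}$$

and

$$E\bigl[\Delta\widehat H^i_n \;\big|\; \mathcal F_{n-1}\bigr] = 0; \tag{41}$$

moreover, (41) is equivalent to the condition

$$E\bigl[e^{\Delta H^i_n} \;\big|\; \mathcal F_{n-1}\bigr] = 1$$

(cf. (11) in § 3c), which is an obvious condition of the inclusion $S^i \in \mathcal M(P)$.

6. We consider now several examples illustrating the above criteria. To this end we observe first of all the following.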
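Assume that a $(B, S)$-market is formed by two assets: a bank account $B = (B_n)$ such that

$$\Delta B_n = r_n B_{n-1}$$

with $\mathcal F_{n-1}$-measurable $r_n$ and stock $S = (S_n)$ such that

$$\Delta S_n = \rho_n S_{n-1}$$

with $\mathcal F_n$-measurable $\rho_n$. Then the conditions (36) and (37) can be rewritten as follows:

$$E\bigl[\,|\rho_n (1 + \Delta N_n)| \;\big|\; \mathcal F_{n-1}\bigr] < \infty, \tag{42}$$

$$E\bigl[\rho_n (1 + \Delta N_n) \;\big|\; \mathcal F_{n-1}\bigr] = r_n. \tag{43}$$

On the other hand, setting

$$B_n = B_{n-1} e^{r_n} \quad \text{and} \quad S_n = S_{n-1} e^{\rho_n}, \tag{44}$$

we can rewrite (36) and (37) as

$$E\bigl[\,|(e^{\rho_n} - 1)(1 + \Delta N_n)| \;\big|\; \mathcal F_{n-1}\bigr] < \infty \tag{45}$$

and

$$E\bigl[(e^{\rho_n} - 1)(1 + \Delta N_n) \;\big|\; \mathcal F_{n-1}\bigr] = e^{r_n} - 1. \tag{46}$$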


EXAMPLE 1. We consider a single-step model (44) with $n = 0$ or 1, where we set $\mathcal F_0 = \{\varnothing, \Omega\}$ and $Z_0 = 1$. Then $1 + \Delta N_1 = Z_1$ and, by (46) (writing $r$ for $r_1$),

$$E\, e^{\rho_1} Z_1 = e^{r}.$$

We shall assume that $\Omega = \mathbb R$, $\rho_1(x) = x$, $Z_1 = Z_1(x)$, and let $F = F(x)$ be the probability distribution in $\Omega$. Then the above condition is equivalent to the equality

$$\int_{-\infty}^{\infty} e^{x} Z_1(x)\, dF(x) = e^{r}. \tag{47}$$

Thus it is clear that finding all distributions $\widetilde F = \widetilde F(x)$ equivalent to $F = F(x)$ (in the sense of the equivalence of the corresponding Lebesgue-Stieltjes measures) is the same as describing all positive solutions $Z_1 = Z_1(x)$ of equation (47) satisfying the condition

$$\int_{-\infty}^{\infty} Z_1(x)\, dF(x) = 1. \tag{48}$$

For instance, if $F \sim \mathcal N(m, \sigma^2)$ with $\sigma^2 > 0$, then (47) and (48) take the following form:

$$\int_{-\infty}^{\infty} e^{x} Z_1(x)\, \varphi_{m,\sigma^2}(x)\, dx = e^{r}, \qquad \int_{-\infty}^{\infty} Z_1(x)\, \varphi_{m,\sigma^2}(x)\, dx = 1, \tag{49}$$

where $\varphi_{m,\sigma^2}(x) = \dfrac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{(x-m)^2}{2\sigma^2}}$ is the density of the normal distribution.

From (49) we see that if we seek a martingale measure in the class of normal distributions $\mathcal N(\widetilde m, \widetilde\sigma^2)$ with $\widetilde\sigma^2 > 0$, i.e., if

$$Z_1(x) = \frac{\varphi_{\widetilde m, \widetilde\sigma^2}(x)}{\varphi_{m,\sigma^2}(x)}, \tag{50}$$

then ‘admissible’ pairs $(\widetilde m, \widetilde\sigma^2)$ must satisfy the condition

$$\int_{-\infty}^{\infty} e^{x}\, \varphi_{\widetilde m, \widetilde\sigma^2}(x)\, dx = e^{r},$$

which is equivalent to

$$\widetilde m + \frac{\widetilde\sigma^2}{2} = r. \tag{51}$$

In other words all the pairs $(\widetilde m, \widetilde\sigma^2)$ with $\widetilde\sigma^2 > 0$ satisfying (51) are admissible. We have already encountered this condition with $r = 0$ in § 3c (see (14)).

Note that, besides the solutions $Z_1(x)$ of the form (50), the system (49) has other solutions, and their general form is probably unknown. This shows the complexity
of the description problem for all martingale measures even in the case of the above 
simple ‘single-step’ scheme. 
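Condition (51) lends itself to a quick numerical check. The sketch below is only an illustration under stated assumptions: the parameter values (`r`, `m`, `var`, `var_tilde`) are hypothetical, and the integrals in (49) are approximated by a plain trapezoidal rule on a truncated interval. It builds the density ratio (50) for a pair $(\widetilde m, \widetilde\sigma^2)$ chosen to satisfy (51) and confirms both equations in (49).

```python
import math


def normal_pdf(x, mean, var):
    """Density of the normal distribution N(mean, var)."""
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


def integrate(f, lo, hi, steps=20_000):
    """Composite trapezoidal rule on [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, steps):
        total += f(lo + i * h)
    return total * h


# Hypothetical parameters, chosen for illustration only.
r = 0.03
m, var = 0.10, 0.04            # original law N(m, sigma^2)
var_tilde = 0.09               # any positive variance is allowed ...
m_tilde = r - var_tilde / 2    # ... once the mean is tied to it by (51)


def z1(x):
    """Density ratio (50) of the candidate martingale law to the original law."""
    return normal_pdf(x, m_tilde, var_tilde) / normal_pdf(x, m, var)


LO, HI = -4.0, 4.0  # truncation of the real line; the neglected tails are negligible here
total_mass = integrate(lambda x: z1(x) * normal_pdf(x, m, var), LO, HI)
discounted = integrate(lambda x: math.exp(x) * z1(x) * normal_pdf(x, m, var), LO, HI)

print(total_mass, discounted, math.exp(r))
```

Changing `var_tilde` while keeping `m_tilde = r - var_tilde / 2` yields a different admissible measure each time, which is one way to see that the martingale measure is far from unique in this single-step model.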

Our next example can be characterized as ‘too simple’ in this respect, for we 
have there a unique, easily calculated ‘martingale’ measure. 


EXAMPLE 2. The CRR-model. Let

$$\Delta B_n = r B_{n-1}, \qquad \Delta S_n = \rho_n S_{n-1}, \tag{52}$$

where $n \le N$ and $B_0$ and $S_0$ are positive constants.
It is assumed in the framework of this model that $(\rho_n)$ is a sequence of independent identically distributed random variables taking two values, $b$ and $a$, such that

$$-1 < a < r < b$$

and

$$P_N(\rho_n = b) = p, \qquad P_N(\rho_n = a) = q, \tag{53}$$

where $0 < p < 1$ and $p + q = 1$.
Since the variables $\rho_n$, $n \ge 1$, are the unique source of ‘randomness’ in the model, our space of elementary outcomes can be the space $\Omega = \{a, b\}^N$ of sequences $(x_1, \ldots, x_N)$ with $x_i = a, b$, and the functions $\rho_n = \rho_n(x)$, $x = (x_1, \ldots, x_N)$, can be defined coordinate-wise: $\rho_n(x) = x_n$.
We can define in a standard way a probability measure $P_N = P_N(x_1, \ldots, x_N)$ such that $\rho_1, \ldots, \rho_N$ are independent variables with respect to $P_N$ and (53) holds; namely,

$$P_N(x_1, \ldots, x_N) = p^{\nu_b(x_1, \ldots, x_N)}\, q^{N - \nu_b(x_1, \ldots, x_N)},$$

where $\nu_b(x_1, \ldots, x_N) = \sum_{i=1}^{N} I_b(x_i)$ is the number of the $x_i$ equal to $b$.


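We shall construct the measure $\widetilde P_N \sim P_N$ in several steps: we set $P_n = P_N \,|\, \mathcal F_n$, where $\mathcal F_n = \sigma(\rho_1, \ldots, \rho_n)$, and define $\widetilde P_1, \widetilde P_2, \ldots, \widetilde P_N$ by the formulas

$$\widetilde P_n(x_1, \ldots, x_n) = Z_n(x_1, \ldots, x_n)\, P_n(x_1, \ldots, x_n),$$

where we shall (successively) find the $Z_n$ from condition (43), which, bearing in mind that $1 + \Delta N_n = \dfrac{Z_n}{Z_{n-1}}$, can be rewritten as

$$E_{P_n}\left[\rho_n \frac{Z_n}{Z_{n-1}} \;\Big|\; \mathcal F_{n-1}\right] = r. \tag{54}$$

For $n = 1$ (with $\mathcal F_0 = \{\varnothing, \Omega\}$) we obtain by (54) that

$$p\, b\, Z_1(b) + q\, a\, Z_1(a) = r, \tag{55}$$

which, together with the normalization condition

$$p\, Z_1(b) + q\, Z_1(a) = 1, \tag{56}$$

delivers with necessity the equalities

$$Z_1(b) = \frac{r - a}{b - a} \cdot \frac{1}{p} \quad \text{and} \quad Z_1(a) = \frac{b - r}{b - a} \cdot \frac{1}{q}. \tag{57}$$

We set

$$\widetilde p = \frac{r - a}{b - a} \quad \text{and} \quad \widetilde q = \frac{b - r}{b - a}.$$

Then

$$\widetilde P_1(b) = Z_1(b)\, P_1(b) = \widetilde p, \qquad \widetilde P_1(a) = Z_1(a)\, P_1(a) = \widetilde q. \tag{58}$$

To find $\widetilde P_2$ (and, after that, $\widetilde P_3, \ldots, \widetilde P_N$) we use (54) again. Based on the fact that $\rho_1$ and $\rho_2$ are independent with respect to $P_2$, we obtain by (54) that

$$p\, b\, \frac{Z_2(b, b)}{Z_1(b)} + q\, a\, \frac{Z_2(b, a)}{Z_1(b)} = r. \tag{59}$$

An additional condition on the values of $Z_2(b, b)$ and $Z_2(b, a)$ can be obtained from the martingale property

$$E_{P_2}\bigl[Z_2(\rho_1, \rho_2) \,\big|\, \rho_1 = b\bigr] = Z_1(b),$$

which brings us to the equality

$$p\, \frac{Z_2(b, b)}{Z_1(b)} + q\, \frac{Z_2(b, a)}{Z_1(b)} = 1. \tag{60}$$

Comparing (59) and (60) with (55) and (56) we see that

$$\frac{Z_2(b, b)}{Z_1(b)} = \frac{r - a}{b - a} \cdot \frac{1}{p} = \frac{\widetilde p}{p}, \qquad \frac{Z_2(b, a)}{Z_1(b)} = \frac{\widetilde q}{q}.$$

In a similar way,

$$\frac{Z_2(a, b)}{Z_1(a)} = \frac{\widetilde p}{p} \quad \text{and} \quad \frac{Z_2(a, a)}{Z_1(a)} = \frac{\widetilde q}{q}.$$

Hence

$$\widetilde P_2(a, a) = Z_2(a, a)\, q^2 = Z_1(a)\, \frac{\widetilde q}{q} \cdot q^2 = \widetilde q^{\,2},$$

$$\widetilde P_2(a, b) = \widetilde q\, \widetilde p, \qquad \widetilde P_2(b, a) = \widetilde p\, \widetilde q, \qquad \widetilde P_2(b, b) = \widetilde p^{\,2}.$$

It is now clear that the variables $\rho_1$ and $\rho_2$ are identically distributed and independent with respect to the measure $\widetilde P_2$; moreover, $\widetilde P_2(\rho_j = b) = \widetilde p$ and $\widetilde P_2(\rho_j = a) = \widetilde q$, $j = 1, 2$.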

The next steps, the specification of P3,...,Py, proceed in a similar manner, 
which brings us to the following result. 


THEOREM 2. A martingale measure $\widetilde P_N$ in the CRR model defined in (52) and (53) is unique and can be defined by the formula

$$\widetilde P_N(x_1, \ldots, x_N) = \widetilde p^{\,\nu_b(x_1, \ldots, x_N)}\, \widetilde q^{\,N - \nu_b(x_1, \ldots, x_N)}, \tag{61}$$

where

$$\widetilde p = \frac{r - a}{b - a}, \qquad \widetilde q = \frac{b - r}{b - a}. \tag{62}$$

Remark. We note that it was not a priori obvious that the martingale measure $\widetilde P_N$ was a direct product of ‘one-dimensional’ distributions:

$$\widetilde P_N = \underbrace{\widetilde P_1 \otimes \cdots \otimes \widetilde P_1}_{N},$$

i.e., that the variables $\rho_1, \ldots, \rho_N$, which are independent and identically distributed with respect to the initial measure $P_N$, are also independent and identically distributed with respect to the martingale measure $\widetilde P_N$.
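Formulas (61) and (62) can be verified by brute-force enumeration for a small horizon. The following is a minimal sketch under assumed, purely illustrative parameter values (`a`, `b`, `r`, `p`, `N`), with exact rational arithmetic; the helper `prob` is hypothetical. It checks that the product measure has total mass one, that every one-step conditional mean of $\rho_n$ equals $r$, and that the density $Z_N = \widetilde P_N / P_N$ integrates to one under the original measure.

```python
from fractions import Fraction
from itertools import product

# Hypothetical CRR parameters with -1 < a < r < b, chosen for illustration only.
a, b, r = Fraction(-1, 10), Fraction(1, 5), Fraction(1, 20)
p = Fraction(3, 5)
q = 1 - p
N = 3

# Martingale probabilities from (62).
p_tilde = (r - a) / (b - a)
q_tilde = (b - r) / (b - a)


def prob(path, up, down):
    """Probability of a path of a's and b's under i.i.d. coordinates, as in (61)."""
    n_up = sum(1 for x in path if x == b)
    return up ** n_up * down ** (len(path) - n_up)


paths = list(product((a, b), repeat=N))
total_mass = sum(prob(x, p_tilde, q_tilde) for x in paths)

# Density Z_N of the martingale measure with respect to the original measure.
density = {x: prob(x, p_tilde, q_tilde) / prob(x, p, q) for x in paths}
density_mean = sum(density[x] * prob(x, p, q) for x in paths)

# Conditional mean of rho_{n+1} under the martingale measure, for every history.
conditional_means = []
for n in range(N):
    for history in product((a, b), repeat=n):
        weight = prob(history, p_tilde, q_tilde)
        numerator = sum(x * prob(history + (x,), p_tilde, q_tilde) for x in (a, b))
        conditional_means.append(numerator / weight)

print(p_tilde, q_tilde, total_mass, set(conditional_means))
```

The original probabilities `p` and `q` enter only through the density `density`; the martingale probabilities depend on `a`, `b`, and `r` alone.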


4. Complete and Perfect Arbitrage-Free Markets 


§4a. Martingale Criterion of a Complete Market. 
Statement of the Second Fundamental Theorem. 
Proof of Necessity 


1. In accordance with the definitions in § 1b, a $(B, S)$-market defined on a filtered probability space $(\Omega, \mathcal F, (\mathcal F_n), P)$, where $0 \le n \le N$, $\mathcal F_0 = \{\varnothing, \Omega\}$, and $\mathcal F_N = \mathcal F$, is said to be complete (perfect) or $N$-complete ($N$-perfect) if each $\mathcal F_N$-measurable bounded (finite) pay-off function $f_N = f_N(\omega)$ is replicable: there exist a self-financing portfolio $\pi$ and an initial capital $x$ such that $X^\pi_0 = x$ and

$$X^\pi_N = f_N \quad (P\text{-a.s.}).$$

Let $\mathcal P(P)$ be the collection of all martingale measures $\widetilde P \sim P$, such that the discounted prices $\frac{S}{B}$ are martingales. It is assumed (see §§ 1a, 2a) that $B = (B_n)_{0 \le n \le N}$ is a risk-free asset and $S = (S_n)_{0 \le n \le N}$ with $S_n = (S^1_n, \ldots, S^d_n)$, $d < \infty$, is a multidimensional risk asset.

The asset $B$ is usually regarded as a bank account; the assets $S^i$ are called stock.

In what follows we assume that $B_n > 0$ for $n \ge 0$. Then we can set $B_n \equiv 1$, $n \ge 0$, without loss of generality.

The next result is so important that it may well be called the ‘Second funda- 
mental asset pricing theorem’. 


THEOREM B. An arbitrage-free financial $(B, S)$-market (with $N < \infty$ and $d < \infty$) is complete if and only if the set $\mathcal P(P)$ of martingale measures contains a single element.


Thus, while the absence of arbitrage means that

$$\mathcal P(P) \ne \varnothing,$$


482 Chapter V. Theory of Arbitrage. Discrete Time 


the completeness of an arbitrage-free market can (provisionally) be written as
$|\mathcal P(P)| = 1$.

We now make several observations relating to the proof of this theorem. 

It is well known in stochastic calculus (see, e.g., [250; Chapter III]) that the 
uniqueness of a martingale measure is intimately connected with the issues of ‘representability’ of local martingales in terms of certain basic martingales. The
corresponding results (especially in the continuous-time case) are considered to be 
technically difficult, because one must essentially use ideas and tools of the stochas- 
tic analysis of semimartingales and random measures in their proofs. 

Incidentally, in the discrete-time case this range of issues pertaining to the ‘rep- 
resentability’ problem for local martingales and the completeness of a (B, S)-market 
can be discussed on a relatively elementary level. We start our discussion with the 
case of d = 1 (§§ 4a-4e). The general case d > 1 is considered in § 4f. 


2. The idea of the proof of Theorem B in the case of $d = 1$ is to establish the following chain of implications (involving the concepts of ‘conditional two-pointedness’ and ‘S-representability’, which are introduced below, in §§ 4b, e, and the equality $\mathcal F_n = \mathcal F^S_n$ meaning that the σ-algebra $\mathcal F_n$ coincides with the σ-algebra

$$\mathcal F^S_n = \sigma(S_1, \ldots, S_n)$$

generated by the random variables $S_1, \ldots, S_n$ up to sets of $P$-measure zero):

$$\begin{array}{ccc}
 & \text{completeness} & \\
 & \Downarrow\ \{1\} & \\
 & |\mathcal P(P)| = 1 & \\
\{2\} \swarrow & & \searrow \{3\} \\
\text{‘conditional two-pointedness’} & & \mathcal F_n = \mathcal F^S_n \\
\{4\} \searrow & & \swarrow \{4\} \\
 & \text{‘}S\text{-representability’} & \\
 & \Downarrow\ \{5\} & \\
 & \text{completeness} &
\end{array}$$

Implication {1}, i.e., the necessity in Theorem B, has a relatively easy proof that proceeds as follows.
Let $A \in \mathcal F_N$. We set $f_N = I_A(\omega)$. In view of the assumed completeness there exists a starting capital $x$ and a self-financing strategy $\pi$ such that if $X^\pi_0 = x$, then $X^\pi_N = f_N$ ($P$-a.s.).
Since $\pi$ is a self-financing strategy, it follows that

$$X^\pi_n = X^\pi_0 + \sum_{k=1}^{n} \gamma_k\, \Delta S_k.$$

Let $P_i$, $i = 1, 2$, be two martingale measures in the family $\mathcal P(P)$. Then $(X^\pi_n)_{n \le N}$ is a martingale transform, and since $X^\pi_N = I_A$, the sequence $X^\pi = (X^\pi_n)_{n \le N}$ is a martingale with respect to each martingale measure $P_i$, $i = 1, 2$, by the lemma in Chapter II, § 1c.

Hence

$$x = X^\pi_0 = E_{P_i}(X^\pi_N \,|\, \mathcal F_0) = E_{P_i} I_A = P_i(A)$$

for $i = 1, 2$, so that $P_1(A) = P_2(A)$, $A \in \mathcal F_N$.

Thus, the measures $P_1$ and $P_2$ are in fact the same, which proves that the set $\mathcal P(P)$ (non-empty due to the absence of arbitrage in our $(B, S)$-market) has at most one element, so that $|\mathcal P(P)| = 1$. This proves the necessity in Theorem B (implication {1}).
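The argument above can be read in the contrapositive: if two different martingale measures exist, some indicator pay-off cannot be replicated. The following sketch illustrates this on a hypothetical one-step ‘trinomial’ model with $B \equiv 1$, which is not taken from the text; the outcome values and both measures are assumptions made only for the example.

```python
from fractions import Fraction

# Hypothetical one-step model with B = 1: the increment dS_1 takes three values.
outcomes = (Fraction(-1), Fraction(0), Fraction(1))

# Two different equivalent measures, both giving dS_1 zero mean (martingale measures).
Q1 = (Fraction(1, 4), Fraction(1, 2), Fraction(1, 4))
Q2 = (Fraction(1, 3), Fraction(1, 3), Fraction(1, 3))


def expectation(measure, f):
    """Expectation of f(dS_1) under the given measure."""
    return sum(weight * f(x) for weight, x in zip(measure, outcomes))


def indicator_A(x):
    """Pay-off f_1 = I_A for the event A = {dS_1 = 0}."""
    return 1 if x == 0 else 0


def replicates(initial_capital, gamma, f):
    """Does X_1 = x + gamma * dS_1 equal the pay-off on every outcome?"""
    return all(initial_capital + gamma * x == f(x) for x in outcomes)


price_under_Q1 = expectation(Q1, indicator_A)
price_under_Q2 = expectation(Q2, indicator_A)

# The outcomes dS_1 = 0 and dS_1 = 1 force x = 1 and gamma = -1,
# which then fails on the outcome dS_1 = -1.
forced_candidate_works = replicates(Fraction(1), Fraction(-1), indicator_A)

print(price_under_Q1, price_under_Q2, forced_candidate_works)
```

A replicable pay-off would have the same expectation under every martingale measure (namely the initial capital $x$), so the two different values show that $I_A$ is not replicable and that this market is incomplete.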

In the next section we consider the issues of ‘representability’, which are impor- 
tant for the discussion of implications {4} and {5}. 


§4b. Representability of Local Martingales. ‘S-Representability’ 


From the standpoint of the ‘general theory of martingales and stochastic calcu- 
lus’ (see [102], [103], [250], [304]) the assumption of ‘completeness’ is in fact equiva- 
lent to the so-called property of ‘S-representability’ of local martingales ([250; Chap- 
ter III]). 

DEFINITION. Let $(\Omega, \mathcal F, (\mathcal F_n), P)$ be a filtered probability space with

a $d$-dimensional (basic) martingale $S = (S_n, \mathcal F_n, P)$

and

a (one-dimensional) local martingale $X = (X_n, \mathcal F_n, P)$.

Then we say that the local martingale $X$ on $(\Omega, \mathcal F, (\mathcal F_n), P)$ admits an ‘S-representation’ or a representation in terms of the $P$-martingale $S$ if there exists a predictable sequence $\gamma = (\gamma_n)$, where $\gamma_n = (\gamma^1_n, \ldots, \gamma^d_n)$, such that

$$X_n = X_0 + \sum_{k=1}^{n} \gamma_k\, \Delta S_k \quad \Bigl( = X_0 + \sum_{k=1}^{n} \Bigl( \sum_{j=1}^{d} \gamma^j_k\, \Delta S^j_k \Bigr) \Bigr) \tag{1}$$

$P$-a.s. for each $n \ge 1$, i.e., $X$ is a martingale transform obtained from the $P$-martingale $S$ by ‘integration’ of the predictable sequence $\gamma$; see Chapter II, § 1c.


The next result relates to implication {5} in the chain of implications in § 4a.


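LEMMA. Let $(B, S)$ be an arbitrage-free market over a finite time horizon $N$, let $B_n \equiv 1$ for $n \le N$, and let $\mathcal P(P)$ be the family of measures $\widetilde P$ equivalent to $P$ (on $(\Omega, \mathcal F)$, where $\mathcal F = \mathcal F_N$), such that $S = (S_n)_{0 \le n \le N}$ is a $\widetilde P$-martingale.

Then this market is complete if and only if there exists a measure $\widetilde P \in \mathcal P(P)$ such that each bounded martingale $X = (X_n, \mathcal F_n, \widetilde P)$ (with $|X_n(\omega)| \le C$, $n \le N$, $\omega \in \Omega$) on $(\Omega, \mathcal F, (\mathcal F_n), \widetilde P)$ admits an ‘S-representation’.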


Proof. (a) Assume that our (arbitrage-free) market is complete. We take an arbitrary measure in $\mathcal P(P)$ as a required measure $\widetilde P$. Let $X = (X_n, \mathcal F_n, \widetilde P)_{n \le N}$ be a martingale with $|X_n(\omega)| \le C$, $n \le N$, $\omega \in \Omega$.

We set $f_N = X_N$. The completeness assumption means that there exists a self-financing portfolio $\pi$ and an initial capital $x$ such that ($P$- and $\widetilde P$-a.s.)

$$X^\pi_n = x + \sum_{k=1}^{n} \gamma_k\, \Delta S_k \tag{2}$$

and $X^\pi_N = f_N = X_N$. However, $|f_N| \le C$, therefore $X^\pi = (X^\pi_n)_{n \le N}$ is a $\widetilde P$-martingale (the lemma in Chapter II, § 1c), so that the $\widetilde P$-martingales $X^\pi$ and $X$, which have the same terminal pay-off function $f_N$, actually coincide ($P$- and $\widetilde P$-a.s.). Hence the martingale $X$ admits an ‘S-representation’.

(b) Now let $f_N = f_N(\omega)$ be an $\mathcal F_N$-measurable bounded function ($|f_N| \le C < \infty$ $P$-a.s.). We claim that there exist a self-financing portfolio $\pi$ and a starting capital $x$ such that $X^\pi_N = f_N$ ($P$-a.s.).

By assumption, there exists a measure $\widetilde P \in \mathcal P(P)$, such that each bounded $\widetilde P$-martingale has an ‘S-representation’.

We consider one such martingale

$$X = (X_n, \mathcal F_n, \widetilde P)_{n \le N}, \quad \text{where } X_n = E_{\widetilde P}(f_N \,|\, \mathcal F_n).$$

Since $|f_N| \le C$, $X$ is a bounded (Lévy) martingale and it has a representation (1) with some $\mathcal F_{k-1}$-measurable variables $\gamma^j_k$, $j = 1, \ldots, d$, $k \le N$.
For these variables we construct a portfolio $\pi^* = (\beta^*, \gamma^*)$ such that $\gamma^* = \gamma$ and

$$\beta^*_n = X_n - \sum_{j=1}^{d} \gamma^j_n S^j_n.$$

By (1) we obtain that the $\beta^*_n$ are $\mathcal F_{n-1}$-measurable. Moreover,

$$\sum_{j=1}^{d} S^j_{n-1}\, \Delta\gamma^j_n + \Delta\beta^*_n = \sum_{j=1}^{d} S^j_{n-1}\, \Delta\gamma^j_n + \Bigl( \Delta X_n - \Delta\Bigl( \sum_{j=1}^{d} \gamma^j_n S^j_n \Bigr) \Bigr)$$

$$= \sum_{j=1}^{d} S^j_{n-1}\, \Delta\gamma^j_n + \sum_{j=1}^{d} \gamma^j_n\, \Delta S^j_n - \Delta\Bigl( \sum_{j=1}^{d} \gamma^j_n S^j_n \Bigr) = 0.$$

Hence $\pi^*$ is a self-financing portfolio and

$$X^{\pi^*}_n = \beta^*_n + \sum_{j=1}^{d} \gamma^j_n S^j_n = X_n;$$

in particular, $X^{\pi^*}_N = X_N = f_N$ ($P$- and $\widetilde P$-a.s.), i.e., our $(B, S)$-market is complete, which proves the lemma.


Remark. If we do not assume that $B_n \equiv 1$, $n \le N$, then all our results are valid for the $\widetilde P$-martingale $\dfrac{S}{B} = \Bigl( \dfrac{S_n}{B_n} \Bigr)_{n \le N}$ in place of the $\widetilde P$-martingale $S = (S_n)_{n \le N}$.

§4c. Representability of Local Martingales
(‘μ-Representability’ and ‘(μ−ν)-Representability’)


1. The issue of ‘S-representability’ is, as shown in the preceding section, closely related to the ‘completeness’ of the corresponding market and the fact that the evolution of the capital $X^\pi$ is described by (2).

As shown in § 4d, we have ‘S-representability’ in the CRR-model, so that the market is complete in this case. Generally speaking, completeness (and therefore also ‘S-representability’) is an exception rather than a rule. It is reasonable therefore to consider here another kind of representations of local martingales, which uses the concepts of random measures $\mu$ and martingale random measures $\mu - \nu$; see § 3e. It will be clear from what follows that ‘μ-representations’ and ‘(μ−ν)-representations’ are significantly more widespread than ‘S-representations’. Hence it often makes sense to find a ‘μ-’ or ‘(μ−ν)-representation’ first and attempt to transform them into an S-representation after that.


2. Let $S = (S^1, \ldots, S^d)$ be a $d$-dimensional martingale on a filtered probability space $(\Omega, \mathcal F, (\mathcal F_n), P)$ with $\mathcal F_0 = \{\varnothing, \Omega\}$ (with respect to the original measure $P$ and the flow $(\mathcal F_n)$).
Let

$$\mathcal F^S_n = \sigma(S^j_k,\ k \le n,\ j = 1, \ldots, d),$$

$n \ge 1$, be the σ-algebra generated by the prices and let

$$X = (X_n, \mathcal F^S_n, P)$$

be a local martingale.
The increments $\Delta X_n = X_n - X_{n-1}$ are $\mathcal F^S_n$-measurable, therefore there exists a Borel function $f_n = f_n(x_1, \ldots, x_n)$, $x_i \in \mathbb R^d$, such that

$$\Delta X_n(\omega) = f_n(\Delta S_1(\omega), \ldots, \Delta S_n(\omega)), \qquad \omega \in \Omega.$$

(Since we have set $\mathcal F_0 = \{\varnothing, \Omega\}$, the vector $S_0$ is not random, and there exists a one-to-one correspondence between $(S_1, \ldots, S_n)$ and $(\Delta S_1, \ldots, \Delta S_n)$.)
For each $n \ge 1$ we now set

$$W_n(\omega, x) = f_n(\Delta S_1(\omega), \ldots, \Delta S_{n-1}(\omega), x).$$

This function is clearly measurable in $(\omega, x)$, $\mathcal B(\mathbb R^d)$-measurable in $x$ for each $\omega \in \Omega$ and $\mathcal F^S_{n-1}$-measurable for each $x \in \mathbb R^d$.

Let $\mu_n(A; \omega) = I(\Delta S_n(\omega) \in A)$, $A \in \mathcal B(\mathbb R^d)$, be the integer-valued random measure constructed from the increments $\Delta S_n(\omega)$, $n \ge 1$. Then

$$\Delta X_n(\omega) = \int_{\mathbb R^d} W_n(\omega, x)\, \mu_n(dx; \omega), \tag{1}$$

so that we obtain the so-called ‘μ-representation’ of $X$:

$$X_n(\omega) = X_0(\omega) + \sum_{k=1}^{n} \int_{\mathbb R^d} W_k(\omega, x)\, \mu_k(dx; \omega), \tag{2}$$

or, in a more compact form,

$$X = X_0 + W * \mu \tag{3}$$

(see § 3e).
We now note that since $X$ is a local martingale by assumption, it follows that $E(|\Delta X_n| \,|\, \mathcal F^S_{n-1}) < \infty$ and $E(\Delta X_n \,|\, \mathcal F^S_{n-1}) = 0$, $n \ge 1$. Hence if

$$\nu_n(A; \omega) = E(\mu_n(A; \cdot) \,|\, \mathcal F^S_{n-1})(\omega), \tag{4}$$

then we can see that

$$\int_{\mathbb R^d} W_n(\omega, x)\, \nu_n(dx; \omega) = E(\Delta X_n \,|\, \mathcal F^S_{n-1})(\omega) = 0.$$

Thus, besides (2) and (3), we obtain the so-called ‘(μ−ν)-representation’

$$X_n(\omega) = X_0(\omega) + \sum_{k=1}^{n} \int_{\mathbb R^d} W_k(\omega, x)\, \bigl(\mu_k(dx; \omega) - \nu_k(dx; \omega)\bigr), \tag{5}$$

or, in a more compact form,

$$X = X_0 + W * (\mu - \nu). \tag{6}$$


It may seem odd that, alongside the natural representation (3), we also consider the representation (6), which can be obtained from the first one in a trivial way, and honor it with a special name ‘(μ−ν)-representation’, assigning to it certain importance by doing so. The reason is as follows.

First, in some more general situations (continuous time, more general flows of σ-algebras, etc.) similar representations of local martingales involve stochastic integrals just of the kind $W * (\mu - \nu)$ (see, e.g., [250; Chapter III, §§ 4.23, 4.24]).
Second, the expressions of type $W * (\mu - \nu)$ (as compared with $W * \mu$) have one advantage: in general, there is no unique way to define the function $W$ in these representations, while in expressions of the type $W * (\mu - \nu)$ these functions can often be chosen pretty simple.

We present the following example as an illustration. 

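Let $A \in \mathcal B(\mathbb R^d)$ and let $X^{(A)} = (X^{(A)}_n, \mathcal F^S_n, P)$ be a martingale with $X^{(A)}_0 = 0$ and

$$\Delta X^{(A)}_n(\omega) = \mu_n(A; \omega) - \nu_n(A; \omega) = I(\Delta S_n(\omega) \in A) - E(I(\Delta S_n \in A) \,|\, \mathcal F^S_{n-1})(\omega).$$

If we set $W^{(A)}_n(\omega, x) = I_A(x)$, then

$$\Delta X^{(A)}_n(\omega) = \int_{\mathbb R^d} W^{(A)}_n(\omega, x)\, \bigl(\mu_n(dx; \omega) - \nu_n(dx; \omega)\bigr),$$

so that

$$X^{(A)} = W^{(A)} * (\mu - \nu).$$

On the other hand, setting

$$\widetilde W^{(A)}_n(\omega, x) = I_A(x) - E(I_A(\Delta S_n) \,|\, \mathcal F^S_{n-1})(\omega)$$

we obtain

$$\int_{\mathbb R^d} \widetilde W^{(A)}_n(\omega, x)\, \mu_n(dx; \omega) = \mu_n(A; \omega) - \nu_n(A; \omega) = \Delta X^{(A)}_n(\omega),$$

so that

$$X^{(A)} = \widetilde W^{(A)} * \mu.$$

Clearly, the function $W^{(A)}$ is simpler than $\widetilde W^{(A)}$.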

These arguments show, in particular, that the integral $W * \mu$ does not change after the replacement of the functions $W_n(\omega, x)$ by $W_n(\omega, x) + g'_n(x)$, where $g'_n(x)$ satisfies the equality

$$\int_{\mathbb R^d} g'_n(x)\, \mu_n(dx; \omega) = 0.$$

In turn, the integral $W * (\mu - \nu)$ does not change if we replace $W_n(\omega, x)$ by $W_n(\omega, x) + g''_n(\omega)$ because

$$\int_{\mathbb R^d} g''_n(\omega)\, \bigl(\mu_n(dx; \omega) - \nu_n(dx; \omega)\bigr) = 0.$$


In the next section we show how one can easily deduce an ‘S-representation’ from a ‘(μ−ν)-representation’ in the CRR-model of Cox-Ross-Rubinstein.


§ 4d. ‘S-Representability’ in the Binomial CRR-Model 


1. As shown in §3f (Example 2 in subsection 5), there exists a martingale measure in the CRR-model (so that the corresponding market is arbitrage-free), which (assuming that we discuss the coordinate probability space) is the unique martingale measure. By Theorem B, this is equivalent to the completeness of the corresponding market.

It would be interesting for that reason to trace how the uniqueness of the martingale measure in this particular model delivers the ‘S-representability’ and, therefore, the completeness of the market (which follows by the lemma in § 4b).

First we recall some notation. 

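As explained in Chapter I, § 1e, the CRR-model of a $(B, S)$-market defined on a filtered probability space $(\Omega, \mathcal F, (\mathcal F_n)_{n \ge 0}, P)$ is described by two sequences $B = (B_n)_{n \ge 0}$ and $S = (S_n)_{n \ge 0}$ such that

$$B_n = B_{n-1}(1 + r_n), \tag{1}$$

$$S_n = S_{n-1}(1 + \rho_n), \tag{2}$$

where the $r_n$ are $\mathcal F_{n-1}$-measurable, the $\rho_n$ are $\mathcal F_n$-measurable, and $B_0 > 0$ and $S_0 > 0$ are constants.

Since

$$\frac{S_n}{B_n} = \frac{S_{n-1}}{B_{n-1}} \cdot \frac{1 + \rho_n}{1 + r_n}, \tag{3}$$

it clearly follows that $\Bigl( \dfrac{S_n}{B_n} \Bigr)_{n \ge 0}$ is a martingale with respect to a measure $\widetilde P$ if, first,

$$\widetilde E\, \left| \frac{1 + \rho_n}{1 + r_n} \right| < \infty$$

(where $\widetilde E$ is averaging with respect to $\widetilde P$) and, second,

$$\widetilde E\left( \frac{1 + \rho_n}{1 + r_n} \;\Big|\; \mathcal F_{n-1} \right) = 1. \tag{4}$$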


In view of the $\mathcal F_{n-1}$-measurability of the variables $r_n$ the condition (4) reduces to the equality

$$\widetilde E(\rho_n \,|\, \mathcal F_{n-1}) = r_n. \tag{5}$$

2. In the Binomial CRR-model one has $r_n \equiv r$, where $r$ is a constant and $(\rho_n)_{n \ge 1}$ is a sequence of independent identically distributed random variables taking two values, $b$ and $a$, with positive probabilities

$$p = P(\rho_n = b) \quad \text{and} \quad q = P(\rho_n = a), \tag{6}$$

$p + q = 1$. (We also agree that $a < b$.) Finally, let $\mathcal F_n = \sigma(\rho_1, \ldots, \rho_n)$ for $n \ge 1$ and let $\mathcal F_0 = \{\varnothing, \Omega\}$.

If we require that the sequence $\dfrac{S}{B} = \Bigl( \dfrac{S_n}{B_n} \Bigr)_{n \ge 0}$ be a martingale with respect to a measure $\widetilde P$ such that $\widetilde P \overset{\mathrm{loc}}{\sim} P$, then the quantities $\widetilde p_n = \widetilde P(\rho_n = b)$ and $\widetilde q_n = \widetilde P(\rho_n = a)$, in view of the equality $\widetilde E \rho_n = r$, must satisfy the condition

$$b\, \widetilde p_n + a\, \widetilde q_n = r,$$

which, in view of the normalization $\widetilde p_n + \widetilde q_n = 1$, shows that

$$\widetilde p_n = \widetilde p = \frac{r - a}{b - a}, \qquad \widetilde q_n = \widetilde q = \frac{b - r}{b - a}. \tag{7}$$

In order that these values be positive one requires that $a < r < b$.

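We shall also assume that $a > -1$. Then $S_n > 0$ for all $n \ge 1$ because $S_0 > 0$.

Let $X = (X_n, \mathcal F_n, \widetilde P)_{n \ge 0}$ be a martingale and let $\mathcal F_0 = \{\varnothing, \Omega\}$ and $\mathcal F_n = \sigma(\rho_1, \ldots, \rho_n)$ for $n \ge 1$.

For $n \ge 1$ we set $\mu_n(A; \omega) = I(\rho_n(\omega) \in A)$ and $\widetilde\nu_n(A) = \widetilde E \mu_n(A; \omega)$. Since $\rho_n$ takes only two values, the measures $\mu_n(\,\cdot\,; \omega)$ and $\widetilde\nu_n(\,\cdot\,)$ are concentrated at $a$ and $b$. Moreover,

$$\mu_n(\{a\}; \omega) = I(\rho_n(\omega) = a), \qquad \widetilde\nu_n(\{a\}) = \widetilde q$$

and

$$\mu_n(\{b\}; \omega) = I(\rho_n(\omega) = b), \qquad \widetilde\nu_n(\{b\}) = \widetilde p.$$

Let $g_n = g_n(x_1, \ldots, x_n)$ be functions such that

$$X_n(\omega) = g_n(\rho_1(\omega), \ldots, \rho_n(\omega)),$$

and therefore

$$\Delta X_n(\omega) = g_n(\rho_1(\omega), \ldots, \rho_n(\omega)) - g_{n-1}(\rho_1(\omega), \ldots, \rho_{n-1}(\omega)).$$

Since $\widetilde E(\Delta X_n \,|\, \mathcal F_{n-1}) = 0$, it follows that

$$\widetilde p \cdot g_n(\rho_1(\omega), \ldots, \rho_{n-1}(\omega), b) + \widetilde q \cdot g_n(\rho_1(\omega), \ldots, \rho_{n-1}(\omega), a) = g_{n-1}(\rho_1(\omega), \ldots, \rho_{n-1}(\omega))$$

or, equivalently,

$$\frac{g_n(\rho_1(\omega), \ldots, \rho_{n-1}(\omega), b) - g_{n-1}(\rho_1(\omega), \ldots, \rho_{n-1}(\omega))}{\widetilde q} = \frac{g_{n-1}(\rho_1(\omega), \ldots, \rho_{n-1}(\omega)) - g_n(\rho_1(\omega), \ldots, \rho_{n-1}(\omega), a)}{\widetilde p}. \tag{8}$$

In view of (7), this can be rewritten as follows:

$$\frac{g_n(\rho_1(\omega), \ldots, \rho_{n-1}(\omega), b) - g_{n-1}(\rho_1(\omega), \ldots, \rho_{n-1}(\omega))}{b - r} = \frac{g_n(\rho_1(\omega), \ldots, \rho_{n-1}(\omega), a) - g_{n-1}(\rho_1(\omega), \ldots, \rho_{n-1}(\omega))}{a - r}. \tag{9}$$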
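We proceed now to ‘μ-representations’. In accordance with (1) in § 4c,

$$\Delta X_n(\omega) = W_n(\omega, \rho_n(\omega)) = \int W_n(\omega, x)\, \mu_n(dx; \omega), \tag{10}$$

where

$$W_n(\omega, x) = g_n(\rho_1(\omega), \ldots, \rho_{n-1}(\omega), x) - g_{n-1}(\rho_1(\omega), \ldots, \rho_{n-1}(\omega)).$$

Setting

$$W'_n(\omega, x) = \frac{W_n(\omega, x)}{x - r}, \tag{11}$$

we see from (10) that

$$\Delta X_n(\omega) = \int (x - r)\, W'_n(\omega, x)\, \mu_n(dx; \omega). \tag{12}$$

Note that, in view of (9), the function $W'_n(\omega, x)$ is independent of $x$. Hence, denoting the right-hand side (or, equivalently, the left-hand side) of (9) by $\gamma'_n(\omega)$, we obtain

$$\Delta X_n(\omega) = \gamma'_n(\omega) \int (x - r)\, \mu_n(dx; \omega) = \gamma'_n(\omega)\, (\rho_n(\omega) - r). \tag{13}$$

Thus, for $X = (X_n, \mathcal F_n, \widetilde P)$ we obtain the representation

$$X_n(\omega) = X_0(\omega) + \sum_{k=1}^{n} \gamma'_k(\omega)\, (\rho_k(\omega) - r). \tag{14}$$

Since

$$\Delta\Bigl( \frac{S_n}{B_n} \Bigr) = \frac{S_{n-1}}{B_{n-1}} \cdot \frac{\rho_n - r}{1 + r},$$

it follows that

$$\rho_n - r = (1 + r)\, \frac{B_{n-1}}{S_{n-1}}\, \Delta\Bigl( \frac{S_n}{B_n} \Bigr), \tag{15}$$

and therefore

$$X_n(\omega) = X_0(\omega) + \sum_{k=1}^{n} \gamma_k(\omega)\, \Delta\Bigl( \frac{S_k}{B_k} \Bigr), \tag{16}$$

where

$$\gamma_k(\omega) = \gamma'_k(\omega)\, (1 + r)\, \frac{B_{k-1}}{S_{k-1}} \tag{17}$$

are $\mathcal F_{k-1}$-measurable functions.

The sequence $\Bigl( \dfrac{S_k}{B_k} \Bigr)_{k \ge 0}$ is a martingale with respect to the measure $\widetilde P$. Hence (16) is just the ‘S-representation’ of the $\widetilde P$-martingale $X$ with respect to the (basic) $\widetilde P$-martingale $\Bigl( \dfrac{S_k}{B_k} \Bigr)_{k \ge 0}$.

Using the lemma in § 4b we see that the $(B, S)$-market described by the CRR-model is complete for each finite time horizon $N$.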
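The passage from (9) to (16) can be traced on a small tree. The sketch below is an illustration under assumptions, not a prescribed algorithm: the parameter values and the pay-off (a discounted call with strike 100) are hypothetical, and the helper names `price`, `bank`, `payoff`, `g`, `gamma`, and `represented` are introduced only for this example. It computes $g_n$ by backward recursion under $\widetilde P$, forms $\gamma'_n$ from (9) and $\gamma_n$ from (17), and checks that the sum in (16) reproduces the terminal value on every path.

```python
from fractions import Fraction
from itertools import product

# Hypothetical CRR parameters with -1 < a < r < b, chosen for illustration only.
a, b, r = Fraction(-1, 10), Fraction(1, 5), Fraction(1, 20)
S0, B0, N = Fraction(100), Fraction(1), 3
p_tilde = (r - a) / (b - a)
q_tilde = 1 - p_tilde


def price(path):
    """Stock price S_n after the returns listed in path."""
    s = S0
    for x in path:
        s *= 1 + x
    return s


def bank(n):
    """Bank account B_n with constant interest rate r."""
    return B0 * (1 + r) ** n


def payoff(path):
    """Hypothetical discounted terminal pay-off X_N: a call with strike 100."""
    return max(price(path) - 100, 0) / bank(N)


def g(path):
    """g_n(x_1, ..., x_n): conditional expectation of X_N under the martingale measure."""
    if len(path) == N:
        return payoff(path)
    return p_tilde * g(path + (b,)) + q_tilde * g(path + (a,))


def gamma(history):
    """Predictable coefficient gamma_k for k = len(history) + 1, via (9) and (17)."""
    gamma_prime = (g(history + (b,)) - g(history)) / (b - r)
    return gamma_prime * (1 + r) * bank(len(history)) / price(history)


def represented(path):
    """Right-hand side of (16) evaluated along the whole path."""
    value = g(())
    for k in range(1, len(path) + 1):
        history = path[: k - 1]
        increment = price(path[:k]) / bank(k) - price(history) / bank(k - 1)
        value += gamma(history) * increment
    return value


paths = list(product((a, b), repeat=N))
print(all(represented(x) == payoff(x) for x in paths))
```

Here `gamma(history)` depends only on the first $k - 1$ returns, which is the predictability required of $\gamma_k$ in (17), and `g(())` plays the role of the initial capital $X_0$.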


3. Remark. It is worth noting that it was essential in our proof of the uniqueness of the martingale measure in the CRR-model in § 3f (Example 2 in subsection 5) that the original probability space was the coordinate one: $\Omega = \{x = (x_1, x_2, \ldots)\}$, where $x_i = a$ or $x_i = b$; $\mathcal F_n = \sigma(x_1, \ldots, x_n)$ for $n \ge 1$, and $\mathcal F = \bigvee \mathcal F_n$. It can be seen from what follows (see § 4f) that in an arbitrary filtered space $(\Omega, \mathcal F, (\mathcal F_n), P)$ the condition that a martingale measure be unique implies automatically that $\mathcal F_n = \sigma(S_1, \ldots, S_n)$ for $n \ge 1$ (up to sets of $P$-measure zero).


§ 4e. Martingale Criterion of a Complete Market.
Proof of Sufficiency for d = 1


1. In accordance with the diagram of implications in § 4a.2, to prove the sufficiency in Theorem B (i.e., the implication ‘$|\mathcal P(P)| = 1$’ $\Longrightarrow$ ‘completeness’) we must verify implications {2}, {3}, and {4} in this diagram. (We recall that we established implication {5} by the lemma in § 4b and $d = 1$ by assumption.)

We start with the proof of implication {4}, where we assume that $B_n \equiv 1$ (and therefore $r_n \equiv 0$) for $n \ge 1$, which brings no loss of generality, as already mentioned.

To this end we point out that it was a key point of our proof of the ‘S-representability’ for the CRR-model that the probability distributions $\mathrm{Law}(\rho_n \,|\, \widetilde P)$, $n \ge 1$, were concentrated at two points ($a$ and $b$, $a < b$).




In other words, it was important that these were ‘two-point’ distributions. The corresponding arguments prove to be valid also for more general models once the (regular) conditional distributions $\mathrm{Law}(\Delta S_n \,|\, \mathcal F_{n-1}; \widetilde P)$ or, equivalently, the conditional distributions $\mathrm{Law}(\rho_n \,|\, \mathcal F_{n-1}; \widetilde P)$ with $\rho_n = \dfrac{\Delta S_n}{S_{n-1}}$ are ‘two-point’ and $\mathcal F_n = \sigma(S_1, \ldots, S_n)$, $n \ge 1$.
Formally, the ‘conditional two-pointedness’ means that there exist two predictable sequences $a = (a_n)$ and $b = (b_n)$, of random variables $a_n = a_n(\omega)$ and $b_n = b_n(\omega)$, $n \ge 1$, such that

$$\widetilde P(\rho_n = a_n \,|\, \mathcal F_{n-1})(\omega) + \widetilde P(\rho_n = b_n \,|\, \mathcal F_{n-1})(\omega) = 1 \tag{1}$$

and $a_n(\omega) \le 0$, $b_n(\omega) \ge 0$ for all $\omega \in \Omega$, $n \ge 1$. (If the values of $a_n(\omega)$ and $b_n(\omega)$ ‘merge’, then we clearly have $a_n(\omega) = b_n(\omega) = 0$; this is an uninteresting degenerate case corresponding to the situation when $\Delta S_n(\omega) = 0$. We can leave this case out of consideration from the very beginning and without loss of generality.)


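Let $\widetilde p_n(\omega) = \widetilde P(\rho_n = b_n \,|\, \mathcal F_{n-1})(\omega)$ and let $\widetilde q_n(\omega) = \widetilde P(\rho_n = a_n \,|\, \mathcal F_{n-1})(\omega)$.
The martingale property of the sequence $S = (S_n, \mathcal F_n, \widetilde P)$ delivers the equalities $\widetilde E(\rho_n \,|\, \mathcal F_{n-1}) = 0$, $n \ge 1$, which mean that

$$b_n(\omega)\, \widetilde p_n(\omega) + a_n(\omega)\, \widetilde q_n(\omega) = 0, \qquad n \ge 1. \tag{2}$$

By (1) and (2) we obtain (cf. formula (7) in § 4d)

$$\widetilde p_n(\omega) = \frac{-a_n(\omega)}{b_n(\omega) - a_n(\omega)} \quad \text{and} \quad \widetilde q_n(\omega) = \frac{b_n(\omega)}{b_n(\omega) - a_n(\omega)}. \tag{3}$$

(If $a_n(\omega) = b_n(\omega) = 0$, then we agree to set $\widetilde p_n(\omega) = \widetilde q_n(\omega) = \frac12$.)

Let $X = (X_n, \mathcal F^S_n, \widetilde P)$ be a local martingale and let $g_n = g_n(x_1, \ldots, x_n)$ be functions such that $X_n(\omega) = g_n(\rho_1(\omega), \ldots, \rho_n(\omega))$. By analogy with formula (8) in § 4d we obtain

$$\frac{g_n(\rho_1(\omega), \ldots, \rho_{n-1}(\omega), b_n(\omega)) - g_{n-1}(\rho_1(\omega), \ldots, \rho_{n-1}(\omega))}{\widetilde q_n(\omega)} = \frac{g_{n-1}(\rho_1(\omega), \ldots, \rho_{n-1}(\omega)) - g_n(\rho_1(\omega), \ldots, \rho_{n-1}(\omega), a_n(\omega))}{\widetilde p_n(\omega)}. \tag{4}$$

Further, following the pattern of (9)-(17) (§ 4d) we see that $X$ has an ‘S-representation’:

$$X_n = X_0 + \sum_{k=1}^{n} \gamma_k(\omega)\, \Delta S_k(\omega) \tag{5}$$

with $\mathcal F^S_{k-1}$-measurable functions $\gamma_k(\omega)$, $k \ge 1$.

Thus, implication {4} is established.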

We consider now the proof of implication {2}, which says that the uniqueness of the martingale measure means the ‘conditional two-pointedness’ (for $d = 1$).

Taking into account that one can operate with the conditional probabilities

$$\widetilde P(\Delta S_n \in \,\cdot\, \,|\, \mathcal F_{n-1})(\omega)$$

like ordinary ones (for each fixed $\omega \in \Omega$), the required ‘conditional two-pointedness’ is equivalent to the following result.

I. Let $Q = Q(dx)$ be a probability distribution on $(\mathbb R, \mathcal B(\mathbb R))$ such that

$$\int_{\mathbb R} |x|\, Q(dx) < \infty, \qquad \int_{\mathbb R} x\, Q(dx) = 0$$

(‘the martingale property’). Let $\mathcal P(Q)$ be the family of all measures $\widetilde Q = \widetilde Q(dx)$ equivalent to $Q = Q(dx)$ and having the properties

$$\int_{\mathbb R} |x|\, \widetilde Q(dx) < \infty, \qquad \int_{\mathbb R} x\, \widetilde Q(dx) = 0.$$

If the family $\mathcal P(Q)$ contains only the (original) measure $Q$, then, of necessity, this is a ‘two-point’ measure: there exist $a \le 0$ and $b \ge 0$ such that

$$Q(\{a\}) + Q(\{b\}) = 1,$$

although these points can ‘merge’ at the origin ($a = b = 0$).

We can put this assertion in the following equivalent form:

II. Let $Z(Q)$ be the class of functions $z = z(x)$, $x \in \mathbb R$, such that

$$Q\{x\colon 0 < z(x) < \infty\} = 1, \qquad \int_{\mathbb R} z(x)\, Q(dx) = 1,$$

$$\int_{\mathbb R} |x|\, z(x)\, Q(dx) < \infty, \qquad \int_{\mathbb R} x\, z(x)\, Q(dx) = 0.$$

We assume that, for some measure $Q$, the class $Z(Q)$ contains only functions that are $Q$-indistinguishable from one ($Q\{x\colon z(x) \ne 1\} = 0$). Then, of necessity, $Q$ is concentrated at two points at most.

Finally, the same assertion can be reformulated as follows.

III. Let $\xi = \xi(x)$ be a random variable on a coordinate space with distribution $Q = Q(dx)$ on $(\mathbb R, \mathcal B(\mathbb R))$.

Assume that $E|\xi| < \infty$, $E\xi = 0$, and let $Q$ be a measure such that if $\widetilde Q \sim Q$, $\widetilde E|\xi| < \infty$, and $\widetilde E\xi = 0$, then $\widetilde Q = Q$.

Then the support of $Q$ consists of at most two points, $a \le 0$ and $b \ge 0$, that can stick together at the origin ($a = b = 0$).

To prove these (equivalent) assertions we observe that each probability distribution $Q = Q(dx)$ on $(\mathbb R, \mathcal B(\mathbb R))$ can be represented as the mixture

$$c_1 Q_1 + c_2 Q_2 + c_3 Q_3$$

of three: a purely discrete one ($Q_1$), an absolutely continuous one ($Q_2$), and a singular one ($Q_3$), with non-negative constants $c_1$, $c_2$, and $c_3$ of unit sum.

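The idea of the proof becomes already transparent in the ‘purely discrete’ case, when we assume that $Q$ is concentrated at three points $x_-$, $x_0$, and $x_+$, $x_- < x_0 < x_+$, with non-zero mass $p_-$, $p_0$, or $p_+$ at each point.

The condition $E\xi = 0$ means that

$$x_- p_- + x_0 p_0 + x_+ p_+ = 0. \tag{6}$$

If $x_0 = 0$, then (6) assumes the form $x_- p_- + x_+ p_+ = 0$.
We set

$$\widetilde p_- = \frac{p_-}{2}, \qquad \widetilde p_0 = \frac12 + \frac{p_0}{2}, \qquad \widetilde p_+ = \frac{p_+}{2}, \tag{7}$$

which corresponds to ‘pumping’ parts of the masses $p_-$ and $p_+$ at $x_-$ and $x_+$ to the point $x_0 = 0$.

It is clear from (7) that the measure $\widetilde Q = \{\widetilde p_-, \widetilde p_0, \widetilde p_+\}$ concentrated at the three points $x_-$, $x_0$, and $x_+$, is a probability measure, $\widetilde Q \sim Q$, and $\widetilde E\xi = 0$; moreover, $\widetilde Q \ne Q$, which contradicts the uniqueness of $Q$.

Thus, $Q$ cannot be concentrated at three points including $x_0 = 0$.

Now let $x_0 \ne 0$. Then the idea of the construction of a measure $\widetilde Q \sim Q$ by ‘mass pumping’ from $x_-$ and $x_+$ to $x_0$ can be realized, e.g., as follows.

We set

$$\widetilde p_- = p_- - \varepsilon_-, \qquad \widetilde p_0 = p_0 + (\varepsilon_- + \varepsilon_+), \qquad \widetilde p_+ = p_+ - \varepsilon_+.$$

For sufficiently small $\varepsilon_-$ and $\varepsilon_+$, $\widetilde Q = \{\widetilde p_-, \widetilde p_0, \widetilde p_+\}$ is a probability measure and we must show that we can choose positive coefficients $\varepsilon_-$ and $\varepsilon_+$ such that $\widetilde E\xi = 0$, i.e.,

$$x_- \widetilde p_- + x_0 \widetilde p_0 + x_+ \widetilde p_+ = (x_- p_- + x_0 p_0 + x_+ p_+) - \bigl(\varepsilon_- x_- - (\varepsilon_- + \varepsilon_+) x_0 + \varepsilon_+ x_+\bigr) = 0.$$

Since $E\xi = x_- p_- + x_0 p_0 + x_+ p_+ = 0$, these positive coefficients $\varepsilon_-$ and $\varepsilon_+$ must satisfy the equality

$$\frac{\varepsilon_+}{\varepsilon_-} = \frac{x_0 - x_-}{x_+ - x_0}.$$

We set $\lambda = \dfrac{x_0 - x_-}{x_+ - x_0}$ ($> 0$). Then it is clear that choosing sufficiently small $\varepsilon_-$ first and setting $\varepsilon_+ = \lambda \varepsilon_-$ after that we can achieve the inequalities $\widetilde p_- > 0$, $\widetilde p_0 > 0$, and $\widetilde p_+ > 0$.

Hence $\widetilde Q = \{\widetilde p_-, \widetilde p_0, \widetilde p_+\}$ is a probability measure, $\widetilde Q \sim Q$, $\widetilde Q \ne Q$, and $\widetilde E\xi = 0$, which is again in contradiction with the uniqueness of the martingale measure $Q$, so that all the three masses $p_-$, $p_0$, and $p_+$ of the distribution $Q$ cannot be positive.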
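The ‘mass pumping’ construction for $x_0 \ne 0$ is easy to reproduce with exact arithmetic. The following is a minimal sketch in which the three support points, the masses, and the value of $\varepsilon_-$ are hypothetical choices made for illustration; `pump_mass` and `mean` are helper names introduced here.

```python
from fractions import Fraction


def pump_mass(points, masses, eps_minus):
    """Move mass from the outer points to the middle one while keeping the mean fixed."""
    x_minus, x_zero, x_plus = points
    p_minus, p_zero, p_plus = masses
    lam = (x_zero - x_minus) / (x_plus - x_zero)   # lambda > 0 for ordered points
    eps_plus = lam * eps_minus
    return (p_minus - eps_minus, p_zero + eps_minus + eps_plus, p_plus - eps_plus)


def mean(points, masses):
    """Expectation of the coordinate variable under a three-point measure."""
    return sum(x * p for x, p in zip(points, masses))


# Hypothetical three-point martingale measure with x_0 != 0 and zero mean.
points = (Fraction(-2), Fraction(1), Fraction(3))
masses = (Fraction(1, 2), Fraction(1, 4), Fraction(1, 4))

new_masses = pump_mass(points, masses, Fraction(1, 10))

print(new_masses, sum(new_masses), mean(points, new_masses))
```

Any sufficiently small positive `eps_minus` works, so a three-point martingale measure is never the only one in its equivalence class.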

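This construction is easy to transfer to the case when a purely discrete martingale
measure $Q$ is concentrated at a countable set $\{x_i,\ i = 0, \pm1, \pm2, \dots\}$,

$$\dots < x_{-2} < x_{-1} < x_0 < x_1 < x_2 < \dots,$$

with respective probabilities $\{p_i,\ i = 0, \pm1, \pm2, \dots\}$.

If the set $\{x_i,\ i = 0, \pm1, \pm2, \dots\}$ contains the origin, say, $x_0 = 0$, then we must
set

$$\tilde p_i = \frac{p_i}{2}, \quad i \ne 0, \qquad \text{and} \qquad \tilde p_0 = \frac12 + \frac{p_0}{2}.$$

Then $\sum_i \tilde p_i = 1$ and $\widetilde{E}\xi = \sum_i x_i \tilde p_i = \frac12 \sum_i x_i p_i = 0$.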


The measure $\widetilde{Q} = \{\tilde p_i,\ i = 0, \pm1, \pm2, \dots\}$ is a probability measure, $\widetilde{Q} \sim Q$,
$\widetilde{Q} \ne Q$, and $\widetilde{E}\xi = 0$, which is incompatible with our assumption that the martingale
measure is unique.

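Now let $\{x_i,\ i = 0, \pm1, \pm2, \dots\}$ be a set of points with $x_i \ne 0$ for all $i$. We
construct a new distribution $\widetilde{Q} = \{\tilde p_i,\ i = 0, \pm1, \pm2, \dots\}$, where $\tilde p_i = p_i$ for
$i = \pm2, \pm3, \dots$ and, as above,

$$\tilde p_{-1} = p_{-1} - \varepsilon_{-1}, \qquad \tilde p_{+1} = p_{+1} - \varepsilon_{+1}, \qquad \tilde p_0 = p_0 + (\varepsilon_{-1} + \varepsilon_{+1}).$$

Then

$$\widetilde{E}\xi = E\xi - \varepsilon_{-1}x_{-1} + (\varepsilon_{-1} + \varepsilon_{+1})x_0 - \varepsilon_{+1}x_{+1} = \varepsilon_{+1}(x_0 - x_{+1}) + \varepsilon_{-1}(x_0 - x_{-1}),$$

and the same choice of $\varepsilon_{-1}$ and $\varepsilon_{+1}$ as in the above case of three points $(x_-, x_0, x_+)$
brings us to a new martingale measure $\widetilde{Q}$ distinct from $Q$ but equivalent to it, which
contradicts the assumption of the uniqueness of the martingale measure $Q$.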

In a similar way we can consider cases where Q has absolutely continuous or 
singular components. 


2. We now turn to the proof of implication {3}. We claim that the uniqueness of
the martingale measure means that the $\sigma$-algebras $\mathcal{F}_n$ are generated by the prices $S$:

$$\mathcal{F}_n = \mathcal{F}_n^S = \sigma(S_0, \dots, S_n), \qquad n \le N.$$

We shall proceed by induction. (Note that the $\sigma$-algebras $\mathcal{F}_0$ and $\mathcal{F}_0^S$ are the same
since, by assumption, $\mathcal{F}_0 = \{\varnothing, \Omega\}$ and $S_0$ is a non-random variable.)

Let $(\Omega, \mathcal{F}, (\mathcal{F}_n)_{n \le N}, P)$ be a filtered probability space and let $S = (S_n, \mathcal{F}_n, P)_{n \le N}$
be a sequence of (stock) prices with $S_n = (S_n^1, \dots, S_n^d)$. To avoid additional notation
we shall assume that $P$ is itself a martingale measure.
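Assuming that $\mathcal{F}_{n-1} = \mathcal{F}_{n-1}^S$ we consider a set $A \in \mathcal{F}_n$. Let

$$z = 1 + \tfrac12\bigl(I_A - E(I_A \mid \mathcal{F}_n^S)\bigr). \tag{8}$$

Clearly, $\tfrac12 \le z \le \tfrac32$ and $Ez = 1$. Hence the measure $P'$ with $P'(d\omega) = z(\omega)\,P(d\omega)$ is
a probability measure and $P' \sim P$. Let $z_i = E(z \mid \mathcal{F}_i)$. By Bayes's formula (see (4)
in § 3a),

$$E'(\Delta S_i \mid \mathcal{F}_{i-1}) = E\Bigl(\frac{z_i}{z_{i-1}}\,\Delta S_i \Bigm| \mathcal{F}_{i-1}\Bigr). \tag{9}$$

In view of our assumption that $\mathcal{F}_{n-1} = \mathcal{F}_{n-1}^S$, it follows from (8) that $E(z \mid \mathcal{F}_{n-1}) = 1$.
At the same time $z$ is an $\mathcal{F}_n$-measurable function. Hence

$$\frac{z_i}{z_{i-1}} = 1 \quad \text{for } i \ne n,$$

so that $E'(\Delta S_i \mid \mathcal{F}_{i-1}) = 0$ for all $i \ne n$.

Since $\dfrac{z_n}{z_{n-1}} = z$, $E(z \mid \mathcal{F}_n^S) = 1$, and the $\Delta S_n$ are $\mathcal{F}_n^S$-measurable, it follows
by (9) that

$$\begin{aligned}
E'(\Delta S_n \mid \mathcal{F}_{n-1}) &= E(z\,\Delta S_n \mid \mathcal{F}_{n-1}) \\
&= E(z\,\Delta S_n \mid \mathcal{F}_{n-1}^S) = E\bigl(E(z\,\Delta S_n \mid \mathcal{F}_n^S) \bigm| \mathcal{F}_{n-1}^S\bigr) \\
&= E\bigl(\Delta S_n\, E(z \mid \mathcal{F}_n^S) \bigm| \mathcal{F}_{n-1}^S\bigr) = E(\Delta S_n \mid \mathcal{F}_{n-1}^S) = 0,
\end{aligned}$$

where we also use the equality $E(z \mid \mathcal{F}_n^S) = 1$ holding by (8).

Hence the sequence of prices $(S_n, \mathcal{F}_n)_{n \le N}$ is a $P'$-martingale.

Thus, the assumption that $P$ is a unique martingale measure brings us to the
equality $z = 1$ ($P$-a.s.), so that for each $A \in \mathcal{F}_n$ we obtain

$$I_A = E(I_A \mid \mathcal{F}_n^S) \quad (P\text{-a.s.})$$

by (8). Hence, $\mathcal{F}_n = \mathcal{F}_n^S$ up to sets of $P$-measure zero.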


Using induction on $n$ we can verify these relations for all $n \le N$, which proves
implication {3}.


3. Thus, the uniqueness of the martingale measure $P$ ensures the conclusions of
the implications {2} and {3}. Combined, they mean ‘S-representability’, which, in
turn, delivers the completeness of the market (by the Lemma in § 4a). This done,
we have established the sufficiency part of Theorem B (for $d = 1$).


Remark 1. It is worth noting that, as the above proof of Theorem B shows, a
discrete-time complete arbitrage-free market (with $N < \infty$ and $d = 1$) is in fact
discrete also in the phase variable in the following sense: the $\sigma$-algebra $\mathcal{F}_N$ is
purely atomic (up to $P$-measure zero) and contains at most $2^N$ atoms. This is an
immediate consequence of a ‘conditional two-pointedness’. (In the case of arbitrary
$d < \infty$ the number of atoms in $\mathcal{F}_N$ is at most $(d+1)^N$.)
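
The bound $2^N$ can be illustrated by enumerating price paths in a two-valued (binomial) model. This is only a sketch under assumed parameters (the helper `price_paths` and the up/down factors are hypothetical): each atom of $\sigma(S_0, \dots, S_N)$ corresponds to one distinct path, not to one terminal price.

```python
from fractions import Fraction as F
from itertools import product

def price_paths(s0, up, down, n_steps):
    """Return the set of distinct price paths (S_0, ..., S_N) of a binomial tree."""
    paths = set()
    for moves in product((up, down), repeat=n_steps):
        s, path = s0, [s0]
        for factor in moves:
            s = s * factor
            path.append(s)
        paths.add(tuple(path))
    return paths

paths = price_paths(F(1), F(2), F(1, 2), 3)
terminal_values = {path[-1] for path in paths}
```

With three steps there are $2^3 = 8$ paths but only 4 distinct terminal prices, which is why the atoms are counted on the level of histories.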


Remark 2. The fact that for $N < \infty$ and $d < \infty$ the $\sigma$-algebra $\mathcal{F}_N$ corresponding
to a complete arbitrage-free market has at most $(d+1)^N$ atoms shows that
completeness and perfection mean the same for such markets.
§ 4f. Extended Version of the Second Fundamental Theorem


1. The above proof of Theorem B relates to the case of d = 1. (We have explicitly 
used this assumption in our proof of implications {2} and {4}.) In the general 
case of d > 1, it seems appropriate to present an extended version of this theorem 
including the equivalence of the completeness and the uniqueness of a martingale 
measure and several other equivalent characterizations. 

First, let us introduce our notation. 


We set $\overline{S}_n = \dfrac{S_n}{B_n}$ (the discounted prices),

$$Q_n(\,\cdot\,;\omega) = P(\Delta S_n \in \cdot \mid \mathcal{F}_{n-1})(\omega), \qquad \overline{Q}_n(\,\cdot\,;\omega) = P(\Delta \overline{S}_n \in \cdot \mid \mathcal{F}_{n-1})(\omega).$$

We recall that vectors $a_1, \dots, a_k$ with $a_i \in \mathbb{R}^d$, $2 \le k \le d+1$, are said to
be affinely independent if there exists $i \in \{1, \dots, k\}$ such that the $k - 1$ vectors
$(a_i - a_j)$, $j = 1, \dots, k$, $j \ne i$, are linearly independent. If this property holds for
some $i \in \{1, \dots, k\}$, then it also holds for each integer $i = 1, \dots, k$. Note that this
property of affine independence of $d$-dimensional vectors $a_1, \dots, a_k$ is equivalent to
the fact that the minimal affine plane containing $a_1, \dots, a_k$ has dimension $k - 1$.
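
Affine independence can be tested mechanically by computing the rank of the difference vectors. The sketch below is an illustration only: the helper names `rank` and `affinely_independent` are assumptions, and exact rational Gaussian elimination is used to avoid rounding issues.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix (given as a list of rows) by Gaussian elimination over the rationals."""
    m = [[Fraction(v) for v in row] for row in rows]
    n_cols = len(m[0]) if m else 0
    r = 0
    for c in range(n_cols):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [x - factor * y for x, y in zip(m[i], m[r])]
        r += 1
    return r

def affinely_independent(vectors):
    """True if the k given vectors span an affine plane of dimension k - 1."""
    base = vectors[-1]
    diffs = [[x - y for x, y in zip(v, base)] for v in vectors[:-1]]
    return rank(diffs) == len(vectors) - 1
```

Here the last vector plays the role of the reference point $a_i$; by the remark in the text, any other choice of reference vector gives the same answer.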


THEOREM B* (an extended version of the Second fundamental theorem). Assume
that we have an arbitrage-free $(B,S)$-market with $B = (B_n)_{0 \le n \le N}$ and
$S = (S_n)_{0 \le n \le N}$, $S_n = (S_n^1, \dots, S_n^d)$, where the $B_n > 0$ are $\mathcal{F}_{n-1}$-measurable, the
$S_n^i > 0$ are $\mathcal{F}_n$-measurable, $N < \infty$, and $d < \infty$. Then the following conditions are
equivalent.

a) The market is complete.

b) The market is perfect.

c) The set of martingale measures $\mathcal{P}(P)$ contains a unique measure.

d) The set of local martingale measures $\mathcal{P}_{\mathrm{loc}}(P)$ contains a unique measure.

e) There exists a measure $P'$ in $\mathcal{P}_{\mathrm{loc}}(P)$ such that each martingale
$M = (M_n, \mathcal{F}_n, P')_{n \le N}$ has an ‘$\overline{S}$-representation’

$$M_n = M_0 + \sum_{i=1}^{n} \gamma_i\,\Delta\overline{S}_i, \qquad n \le N,$$

with $\mathcal{F}_{i-1}$-measurable $\gamma_i$.

f) $\mathcal{F}_n = \sigma(S_1, \dots, S_n)$ up to sets of $P$-measure zero, and there exist $(d+1)$
predictable $\mathbb{R}^d$-valued processes $(a_{1,n}, \dots, a_{d+1,n})$, $1 \le n \le N$, such that
their values are affinely independent (for all $n$ and $\omega$) and the supports of
the measures $Q_n(\,\cdot\,;\omega)$ lie ($P$-a.s.) in the set $\{a_{1,n}(\omega), \dots, a_{d+1,n}(\omega)\}$.

g) $\mathcal{F}_n = \sigma(\overline{S}_1, \dots, \overline{S}_n)$ up to sets of $P$-measure zero, and there exist $(d+1)$
predictable $\mathbb{R}^d$-valued processes $(\overline{a}_{1,n}, \dots, \overline{a}_{d+1,n})$, $1 \le n \le N$, such that
their values are affinely independent (for all $n$ and $\omega$) and the supports of
the measures $\overline{Q}_n(\,\cdot\,;\omega)$ lie ($P$-a.s.) in the set $\{\overline{a}_{1,n}(\omega), \dots, \overline{a}_{d+1,n}(\omega)\}$.

If these conditions hold, then the $\sigma$-algebra $\mathcal{F}_N$ is purely atomic (with respect
to $P$) and consists of at most $(d+1)^N$ atoms.
Proof. For $d = 1$ this was explained in the preceding sections. In the general case
of $d \ge 1$ the corresponding proof can be found in [251]. We refer the reader there
for the technical details related to the fact that the prices $S = (S^1, \dots, S^d)$ are
vector-valued; we concentrate on outlining the proof and pointing out the
differences between the cases $d = 1$ and $d > 1$.

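First, we note that the equivalence of f) and g) is a simple consequence of the
relation

$$\overline{a}_{i,n} = \frac{a_{i,n}}{B_n} + S_{n-1}\Bigl(\frac{1}{B_n} - \frac{1}{B_{n-1}}\Bigr)$$

between the $\overline{a}_{i,n}$ and the $a_{i,n}$.

Further, we clearly have the implication b) $\Rightarrow$ a) and, by Theorem A* (§ 2e),
d) $\Leftrightarrow$ c).

Hence to prove the theorem it remains to establish the implications discussed
below: a) $\Rightarrow$ d), c) $\Rightarrow$ g), g) $\Rightarrow$ b), a) + g) $\Rightarrow$ e), and e) $\Rightarrow$ a).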

The implication a) $\Rightarrow$ d) can be established in the same way as for $d = 1$
(see § 4a.2), where, in place of the martingale measures $P_i$, $i = 1, 2$, one must
consider local martingale measures.

As regards the implication c) $\Rightarrow$ g), we already proved the equality
$\mathcal{F}_n = \sigma(S_1, \dots, S_n)$ in § 4e.2, in our proof of implication {3} from § 4a.2. (The
corresponding proof is in fact valid for each $d \ge 1$.)

The most tedious part of the proof of the implication c) $\Rightarrow$ g) is to describe
the structure of the supports of the measures $\overline{Q}_n(\,\cdot\,;\omega)$. For $d = 1$ these supports
were ‘two-point’. In the general case of $d \ge 1$ these supports consist of at most
$d + 1$ points (in $\mathbb{R}^d$). This part of the proof is exposed at length in [251] and we
omit it here. (Conceptually, the proof is the same as for $d = 1$ and proceeds as
follows. Assume that $P$ is itself a martingale measure. If the support of $\overline{Q}_n(\,\cdot\,;\omega)$
contains more than $d + 1$ points, then, using the idea of ‘mass pumping’ again,
we can construct a new measure $P'$ by the formula $P'(d\omega) = z(\omega)\,P(d\omega)$. For a
suitable choice of the $\mathcal{F}_N$-measurable function $z(\omega)$, $P'$ is a martingale measure,
$P' \sim P$, and $P' \ne P$. This contradicts, however, our assumption of the uniqueness
of a martingale measure. In a similar way we can also prove that the $\mathbb{R}^d$-valued
variables $\overline{a}_{1,n}, \dots, \overline{a}_{d+1,n}$ are affinely independent.)

We consider now the implication g) $\Rightarrow$ b). Let $f_N$ be an $\mathcal{F}_N$-measurable random
variable and assume that the original measure $P$ is itself a martingale measure.
Then it follows from g) that, in fact, $f_N$ is a random variable with finitely many
possible values.

We claim that $f_N$ can be represented as follows:

$$f_N = x + \sum_{k=1}^{N} \gamma_k\,\Delta\overline{S}_k. \tag{1}$$

The sequence $X_n = x + \sum_{i=1}^{n} \gamma_i\,\Delta\overline{S}_i$, $n \le N$, is a $P$-martingale, therefore the relations
$x = E f_N$ and

$$\gamma_n\,\Delta\overline{S}_n = E(f_N \mid \mathcal{F}_n) - E(f_N \mid \mathcal{F}_{n-1}) \tag{2}$$

must be satisfied.

Hence, to obtain the representation (1) we can set $x = E f_N$ and then show (as
for $d = 1$) that using condition g) we can construct $\mathcal{F}_{n-1}$-measurable functions $\gamma_n$
with required property (2). (See [251] for greater detail.)

We now turn to the implication a) + g) $\Rightarrow$ e). By g) the $\sigma$-algebra $\mathcal{F}_N$ is
purely atomic. Hence all $\mathcal{F}_N$-measurable random variables can take only finitely
many values and are therefore bounded.

Let $P' \in \mathcal{P}_{\mathrm{loc}}(P)$ and let $M = (M_n, \mathcal{F}_n, P')_{n \le N}$ be a martingale. By a) there
exist $x \in \mathbb{R}$ and a predictable process $\gamma = (\gamma_n)$ such that $M_N = x + \sum_{i=1}^{N} \gamma_i\,\Delta\overline{S}_i$.
The sequence $M' = (M'_n, \mathcal{F}_n, P')_{n \le N}$ of variables $M'_n = x + \sum_{i=1}^{n} \gamma_i\,\Delta\overline{S}_i$ is a local
$P'$-martingale and, therefore, a martingale because all its elements are bounded.
Since $M_N = M'_N$, the martingales $M$ and $M'$ coincide ($P'$-a.s.), which gives us
assertion e).
Finally, to prove the implication e) $\Rightarrow$ a) it suffices to make the following
observation (cf. the Lemma in § 4a).

Assume that each martingale $M = (M_n, \mathcal{F}_n, P')$ with $P' \in \mathcal{P}_{\mathrm{loc}}(P)$ admits an
‘$\overline{S}$-representation’ $M_n = M_0 + \sum_{i=1}^{n} \gamma_i\,\Delta\overline{S}_i$.

Let $f_N$ be an $\mathcal{F}_N$-measurable bounded function. We consider the martingale
$M_n = E'(f_N \mid \mathcal{F}_n)$, $n \le N$, where $E'$ is averaging with respect to $P'$. By assumption,

$$f_N = M_N = M_0 + \sum_{i=1}^{N} \gamma_i\,\Delta\overline{S}_i \quad (P'\text{-a.s.}).$$

Hence we can represent $f_N$ as a sum:

$$f_N = x + \sum_{i=1}^{N} \gamma_i\,\Delta\overline{S}_i \quad (P\text{-a.s.}),$$

where $x = M_0$ and $\gamma = (\gamma_i)_{i \le N}$ is a predictable sequence, which means precisely
that the market is complete.

These arguments complete our discussion of the above-described implications
required in Theorem B*.
2. We present now several examples illustrating both Theorem B* and Theorem A*.


EXAMPLE 1 ($d = 1$). In the framework of the CRR-model (see § 4d) with $B_n = 1$,
$n \le N$, we assume that $(\rho_n)_{n \le N}$ is a sequence of independent identically distributed
random variables taking two values, $a$ and $b$, $a < b$.

Since $\Delta S_n = S_{n-1}\rho_n$, it follows that $\Delta S_n = S_{n-1}a$ or $\Delta S_n = S_{n-1}b$. By
Theorem A*, in order that the market be arbitrage-free the interval $(a, b)$ must
contain the origin. Hence $a < 0 < b$. To ensure that the prices $S$ are positive we
must also set $a > -1$.

In this case $\Delta S_n = S_{n-1}\rho_n$, so that $\Delta S_n$ can take two values: $S_{n-1}b$ (‘upward
price motion’) and $S_{n-1}a$ (‘downward motion’). Hence the supports of the conditional
distributions $Q_n(\,\cdot\,;\omega)$ consist of two points: $S_{n-1}(\omega)a$ and $S_{n-1}(\omega)b$, while
the ‘price tree’ $(S_0, S_1, S_2, \dots)$ (see Fig. 56) has the ‘homogeneous Markov structure’:
if $(S_0, S_1, \dots, S_{n-1})$ is some realization of the price process, then the next
transition brings $S_n = S_{n-1}B$ with probability $p = P(\rho_n = b)$ and $S_n = S_{n-1}A$
with probability $q = P(\rho_n = a)$, where $B = 1 + b$ and $A = 1 + a$.


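[Figure 56: binomial tree of the prices $S_n$ plotted against $n = 1, 2$, starting from $S_0$ and reaching the extreme values $S_0 B^2$ and $S_0 A^2$.]

FIGURE 56. ‘Price tree’ $(S_0, S_1, S_2, \dots)$ in the CRR-model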


If $-1 < a < 0 < b$, then there exists a unique martingale measure, so that the
corresponding $(B,S)$-market is arbitrage-free and complete.

By Theorem B* we obtain that for $d = 1$ all complete arbitrage-free markets
have a fairly similar ‘dyadic’ branching structure of the prices.

Namely, for a fixed ‘history’ $(S_0, S_1, \dots, S_{n-1})$ we have $S_n = S_{n-1}(1 + \rho_n)$,
where the variables $\rho_n = \rho_n(S_0, S_1, \dots, S_{n-1})$ can take only two values,
$a_n = a_n(S_0, S_1, \dots, S_{n-1})$ and $b_n = b_n(S_0, S_1, \dots, S_{n-1})$.

In the above model of Cox-Ross-Rubinstein the variables $a_n$ and $b_n$ are
constant ($a_n = a$ and $b_n = b$). In general, they depend on the price history, but
again, to obtain a complete arbitrage-free market with positive prices, we must set
$-1 < a_n < 0 < b_n$.
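
For the CRR case with $B_n = 1$ the unique martingale measure is determined by the single condition $\widetilde{E}\rho_n = 0$. A minimal sketch (the function name `crr_martingale_prob` and the sample values of $a$ and $b$ are illustrative assumptions):

```python
from fractions import Fraction

def crr_martingale_prob(a, b):
    """Probability of the value b under the martingale measure when B_n = 1.

    Solves p * b + (1 - p) * a = 0, which requires -1 < a < 0 < b.
    """
    if not (-1 < a < 0 < b):
        raise ValueError("need -1 < a < 0 < b for a positive arbitrage-free model")
    return -a / (b - a)

a, b = Fraction(-1, 4), Fraction(1, 2)
p = crr_martingale_prob(a, b)
```

Since the equation is linear in $p$, the solution is unique and lies strictly between 0 and 1 exactly when $a < 0 < b$, in agreement with the arbitrage-free condition stated above.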
EXAMPLE 2 ($d = 2$, $N = 1$). Let $B_0 = B_1 = 1$ and let $S = (S^1, S^2)$ be the prices
of two kinds of stock, where $S_0^1 = S_0^2 = 2$. We consider a single-step model ($N = 1$);
let

$$\Delta S_1 = \begin{pmatrix} \Delta S_1^1 \\ \Delta S_1^2 \end{pmatrix}$$

be the vector of price increments with $\Delta S_1^i = S_1^i - S_0^i = S_1^i - 2$, $i = 1, 2$.

In accordance with Theorem B*, to make the corresponding arbitrage-free market
complete we need that the support of the measure $P(\Delta S_1 \in \cdot\,)$ consist of three
points in the plane,

$$\begin{pmatrix} a_1 \\ b_1 \end{pmatrix}, \quad \begin{pmatrix} a_2 \\ b_2 \end{pmatrix}, \quad \begin{pmatrix} a_3 \\ b_3 \end{pmatrix},$$

and the three corresponding vectors in $\mathbb{R}^2$ must be affinely independent. As already
mentioned, this is equivalent to the linear independence of the vectors
$\begin{pmatrix} a_1 - a_3 \\ b_1 - b_3 \end{pmatrix}$ and $\begin{pmatrix} a_2 - a_3 \\ b_2 - b_3 \end{pmatrix}$.

For instance, assume that the probability of each of the vectors

$$\begin{pmatrix} 1 \\ 1 \end{pmatrix}, \quad \begin{pmatrix} -1 \\ 1 \end{pmatrix}, \quad \begin{pmatrix} 0 \\ -1 \end{pmatrix}$$

is equal to $\tfrac13$. These vectors are affinely independent, and the measure assigning
them probabilities $\tfrac14$, $\tfrac14$, and $\tfrac12$, respectively, is a martingale measure.
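
The martingale weights in Example 2 can be recovered by solving the linear system $\sum_i q_i = 1$, $\sum_i q_i a_i = 0$, $\sum_i q_i b_i = 0$. The sketch below uses Cramer's rule with exact fractions; the helper names `det3` and `martingale_weights` are assumptions made for illustration.

```python
from fractions import Fraction

def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def martingale_weights(points):
    """Weights q with sum(q) = 1 and sum(q_i * point_i) = 0 for three planar points."""
    (a1, b1), (a2, b2), (a3, b3) = [(Fraction(a), Fraction(b)) for a, b in points]
    matrix = [[Fraction(1)] * 3, [a1, a2, a3], [b1, b2, b3]]
    rhs = [Fraction(1), Fraction(0), Fraction(0)]
    d = det3(matrix)
    if d == 0:
        raise ValueError("points are not affinely independent")
    weights = []
    for col in range(3):
        replaced = [row[:] for row in matrix]
        for r in range(3):
            replaced[r][col] = rhs[r]
        weights.append(det3(replaced) / d)
    return weights

q = martingale_weights([(1, 1), (-1, 1), (0, -1)])
```

The determinant is non-zero precisely when the three points are affinely independent, so in that case the weights are unique; they define an equivalent martingale measure when all three are strictly positive.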
1. European Hedge Pricing on Arbitrage-Free Markets


§1a. Risks and Their Reduction 


1. H. Markowitz’s theory ([332], 1952), as expressed in his mean-variance analysis
(see Chapter I, § 2b), suggests an approach to the pricing of investment risks
and the reduction of their nonsystematic component that is based on the idea of
diversification in the selection of an (optimal) investment portfolio.

Several other optimization problems arising in financial theory can, in view of 
the ‘environmental uncertainties’, be (as in H. Markowitz’s case) ranked among 
the problems of stochastic optimization. First of all, it should be mentioned
that finance brings forward a series of nontraditional, nonstandard optimization 
problems relating to hedging (see Chapter V, §1b about the notion of ‘hedging’). 
They are nonstandard in the following sense: optimal hedging as a control must 
deliver certain properties with probability one, rather than, e.g., on the average, as 
is usual in stochastic optimization. (As regards problems involving the mean square 
criterion, see § 1d below.) 

In what follows we put an accent on the discussion of hedging as a method of 
dynamical control of an investment portfolio. It should be noted that this method
is crucial for pricing such (derivative) financial instruments as, e.g., options (see 
subsections 4 and 5). We can say more: it is in connection with the pricing of 
option contracts that the importance of hedging as a protection instrument has 
been understood and its basic methods have been developed. 


2. We recall that we already encountered hedging in Chapter V, § 1b, where in the
simplest case of the single-step model we deduced formulas for both the initial capital
sufficient for the desired result and the optimal (hedging) portfolio itself.

Explicit formulas for hedges are of considerable interest also in several-step prob- 
lems, in which an investor seeks capital levels at a certain (fixed in advance) instant 
N that, with probability 1 (or, more generally, with certain positive probability), 
are not lower than the values at these instants of some fixed (random, in general) 
performance functional. 
Problems of this kind have a direct relation to the pricing of European options;
this connection is based on the following remarkably simple and seminal idea of 
F. Black and M. Scholes [44] and R. Merton [345]: in complete arbitrage-free mar- 
kets 


the dynamics of option prices must be replicated by the dynamics of the 
value of the optimal hedging strategy in the corresponding investment 
problem. 


In the case of American options, besides the hedging of the writer, which realizes 
his ‘control’ strategy, we have one new ‘optimization element’. 

In fact, the buyer of a European option is inert: he is not engaged in financial 
activity and waits for the maturity date N of the option. American-type options are 
another matter. Here the buyer is an acting character in the trade: the contract 
allows him to choose the instant of exercising the option on his own (of course, 
within certain specified limits). 

In selecting the corresponding hedging portfolio the writer must keep in mind 
the buyer’s freedom to exercise the option at any time. Clearly, the contradicting 
interests of the writer and the buyer give rise to optimization problems of the 
minimax nature. 

The present section (§§ 1a-1d) is devoted to European hedging. This name is
inspired by an analogy with European options and means that we are discussing
hedging against claims due at some moment that was fixed in advance. We consider
American hedging in the next section (see, in particular, § 2c for the corresponding
definitions).


3. As already mentioned, the buyer’s control in American options reduces to the
choice of the instant of exercising the contract, or, as it is often called, the stopping
time.

Here, if, e.g., $f = (f_0, f_1, \dots, f_N)$ is a system of pay-off functions ($f_i = f_i(\omega)$,
$i = 0, 1, \dots, N$) and the buyer chooses a stopping time $\tau = \tau(\omega)$, then his returns
are $f_\tau = f_{\tau(\omega)}(\omega)$.

We represent now $f_\tau$ as follows:

$$f_\tau = f_0 + \sum_{k=1}^{\tau} \Delta f_k = f_0 + \sum_{k=1}^{N} I(k \le \tau)\,\Delta f_k. \tag{1}$$

Note that the event $\{k \le \tau\}$ belongs to $\mathcal{F}_{k-1}$. Hence (1) can be rewritten as
follows:

$$f_\tau = f_0 + \sum_{k=1}^{N} \alpha_k\,\Delta f_k, \tag{2}$$

where $\alpha_k = I(k \le \tau)$ is an $\mathcal{F}_{k-1}$-measurable random variable.
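
Representation (2) is a telescoping identity and is easy to verify on a single scenario. The following sketch fixes one hypothetical realization of the pay-offs and checks every possible value of the stopping time; the function name `payoff_via_increments` is an assumption.

```python
def payoff_via_increments(f, tau):
    """Compute f_tau as f_0 + sum_k alpha_k * (f_k - f_{k-1}) with alpha_k = I(k <= tau)."""
    n = len(f) - 1
    alpha = {k: 1 if k <= tau else 0 for k in range(1, n + 1)}
    return f[0] + sum(alpha[k] * (f[k] - f[k - 1]) for k in range(1, n + 1))

# One hypothetical realization (f_0, ..., f_4) of the pay-off sequence.
f = [3, 5, 2, 7, 4]
values = [payoff_via_increments(f, tau) for tau in range(len(f))]
```

The check only concerns the algebra on one path; the measurability of $\alpha_k$ with respect to $\mathcal{F}_{k-1}$ is a property of the stopping time and is not visible at this level.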
We can express this otherwise: the terms of an American option contract
allow the buyer to exert only a predictable control $\alpha = (\alpha_k)_{k \le N}$ of the form
$\alpha_k = I(k \le \tau)$.

In principle, one could imagine contracts allowing other forms of (predictable)
control $\alpha = (\alpha_k)_{k \le N}$ from the buyer’s side. One possible example is a ‘Passport
option’ (see, e.g., [6]) with pay-off function

$$f_N(\alpha) = \Bigl(\sum_{k=1}^{N} \alpha_k\,\Delta S_k\Bigr)^{+}, \tag{3}$$

where $|\alpha_k| \le 1$ and $S = (S_k)_{k \le N}$ is the stock price.

It is clear in this case that the writer of the contract must select a (hedging)
portfolio $\pi = \pi(\alpha(\omega), \omega)$ such that for each admissible buyer’s control $\alpha = \alpha(\omega)$ the
value $X_N^{\pi}(\alpha(\omega), \omega)$ at the terminal date $N$ is at least $f_N(\alpha(\omega))$ ($P$-a.s.).


4. Hedging (by an investor or the writer of an option) and the control exercised by 
the option buyer, say, are the two main ‘optimization’ components that should be 
reckoned with in derivatives pricing on complete markets. 

For incomplete markets, we must add also the third component determined by 
‘natural factors’. 

This means the following. There can exist only one martingale measure in 
complete arbitrage-free markets. However, in incomplete arbitrage-free financial 
markets ‘Nature’ provides one with a whole range of martingale measures and, 
therefore, with distinct versions of the formalization of the market. 

Which measures and formalizations are in fact operational remains unknown to 
the writer and the buyer of an option. Hence both writer and buyer, unless they 
have some additional arguments, must plan their strategy (of hedging, or the choice 
of the stopping time, etc.) bearing in mind the ‘most unfavorable combination of 
natural factors possible’. 

We shall express this formally by considering in many relations the least up- 
per bounds over the class of all martingale measures that can describe potential 
‘environment’. 


§1b. Main Hedge Pricing Formula. Complete Markets 


1. We shall consider a complete arbitrage-free $(B,S)$-market with $N < \infty$ and
$d < \infty$ (see the general scheme in Chapter V, § 2b). By assertion (f) of the extended
version of the Second fundamental theorem (Chapter V, § 4f), such a discrete-time
market is also discrete with respect to the phase variable, and all the $\mathcal{F}_N$-measurable
random variables under consideration can take only finitely many values because
the $\sigma$-algebra $\mathcal{F}_N$ consists of at most $(d+1)^N$ atoms. Thus, there can be no
problems with integration in this case, and the concepts of complete market and
perfect market are equivalent.
DEFINITION. The price of perfect European hedging of an ($\mathcal{F}_N$-measurable)
contingent claim $f_N$ is the quantity

$$C(f_N; P) = \inf\{x\colon \exists\,\pi \text{ with } X_0^{\pi} = x \text{ and } X_N^{\pi} = f_N \ (P\text{-a.s.})\} \tag{1}$$

(cf. Chapter V, § 1b).


Since the market in question is arbitrage-free and complete by assumption,

1) there exists a martingale measure $\widetilde{P}$ equivalent to $P$ such that the sequence
$\Bigl(\dfrac{S_n}{B_n}\Bigr)_{n \le N}$ is a martingale (the First fundamental theorem)

and

2) this measure is unique, and each contingent claim $f_N$ can be replicated,
i.e., there exists a (‘perfect’) hedge $\pi$ such that $X_N^{\pi} = f_N$ (the Second
fundamental theorem).

Hence if $\pi$ is a perfect $(x, f_N)$-hedge, i.e., $X_0^{\pi} = x$ and $X_N^{\pi} = f_N$ ($P$-a.s.), then
(see formula (18) in Chapter V, § 1a)

$$\frac{f_N}{B_N} = \frac{X_N^{\pi}}{B_N} = \frac{x}{B_0} + \sum_{k=1}^{N} \gamma_k\,\Delta\Bigl(\frac{S_k}{B_k}\Bigr), \tag{2}$$

and therefore

$$\widetilde{E}\,\frac{f_N}{B_N} = \frac{x}{B_0},$$

i.e.,

$$x = B_0\,\widetilde{E}\,\frac{f_N}{B_N}. \tag{3}$$

We note that the right-hand side of (3) is independent of the structure of the
$(x, f_N)$-hedge $\pi$ in question. In other words, if $\pi'$ is another hedge, then the initial
prices $x$ and $x'$ are the same.

Hence, we have the following result.


THEOREM 1 (Main formula of the price of perfect European hedging in complete
markets). The price $C(f_N; P)$ of perfect hedging in a complete arbitrage-free market
is described by the formula

$$C(f_N; P) = B_0\,\widetilde{E}\,\frac{f_N}{B_N}. \tag{4}$$
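
Formula (4) can be evaluated directly in a CRR-type model. The sketch below is an illustration under assumptions that go beyond the text: a bank account $B_n = (1 + r)^n$ with constant rate $r$, returns $\rho_n \in \{a, b\}$, a pay-off depending only on $S_N$, and the function name `crr_price`.

```python
from fractions import Fraction
from math import comb

def crr_price(payoff, s0, a, b, r, n_steps):
    """B_0 * E~(f_N / B_N) with B_0 = 1, B_N = (1 + r)**N and f_N = payoff(S_N).

    The martingale probability of an up move is p = (r - a) / (b - a),
    which makes S_n / B_n a martingale (an assumption of this sketch).
    """
    p = (r - a) / (b - a)
    total = 0
    for ups in range(n_steps + 1):
        s_n = s0 * (1 + b) ** ups * (1 + a) ** (n_steps - ups)
        weight = comb(n_steps, ups) * p ** ups * (1 - p) ** (n_steps - ups)
        total += weight * payoff(s_n)
    return total / (1 + r) ** n_steps

call_price = crr_price(lambda s: max(s - 100, 0), Fraction(100),
                       Fraction(-1, 5), Fraction(1, 4), Fraction(0), 1)
stock_price = crr_price(lambda s: s, Fraction(100),
                        Fraction(-1, 5), Fraction(1, 4), Fraction(1, 20), 3)
```

As a sanity check, pricing the ‘claim’ $f_N = S_N$ returns $S_0$, which is just the martingale property of $S/B$ under $\widetilde{P}$.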
2. In hedging one must know not only the price $C(f_N; P)$, but also the composition
of the portfolio bringing about a perfect hedge. A standard method here is as follows
(cf. Chapter V, § 4a).

We construct the martingale $M = (M_n, \mathcal{F}_n, \widetilde{P})_{n \le N}$ with $M_n = \widetilde{E}\Bigl(\dfrac{f_N}{B_N} \Bigm| \mathcal{F}_n\Bigr)$.
Since our market is complete, it follows by the Second fundamental theorem (or the
lemma in Chapter V, § 4b) that $M$ has an ‘$\frac{S}{B}$-representation’

$$M_n = M_0 + \sum_{k=1}^{n} \gamma_k\,\Delta\Bigl(\frac{S_k}{B_k}\Bigr) \tag{5}$$

with $\mathcal{F}_{k-1}$-measurable $\gamma_k$.

We set $\pi^* = (\beta^*, \gamma^*)$, where $\gamma^*$ is equal to $\gamma = (\gamma_n)$ in (5) and $\beta_n^* = M_n - \gamma_n \dfrac{S_n}{B_n}$.
It is easy to verify that this is a self-financing portfolio (see nevertheless the proof
of the lemma in Chapter V, § 4b). Further, by construction,

$$\frac{X_0^{\pi^*}}{B_0} = M_0 \tag{6}$$

and

$$\Delta\Bigl(\frac{X_n^{\pi^*}}{B_n}\Bigr) = \gamma_n^*\,\Delta\Bigl(\frac{S_n}{B_n}\Bigr) = \gamma_n\,\Delta\Bigl(\frac{S_n}{B_n}\Bigr) = \Delta M_n.$$

Hence, for all $0 \le n \le N$ we have

$$\frac{X_n^{\pi^*}}{B_n} = M_n = \widetilde{E}\Bigl(\frac{f_N}{B_N} \Bigm| \mathcal{F}_n\Bigr), \tag{7}$$

and, in particular,

$$X_N^{\pi^*} = f_N \quad (P\text{- and } \widetilde{P}\text{-a.s.}).$$

Thus, the portfolio $\pi^*$ constructed on the basis of the ‘$\frac{S}{B}$-representation’ is a perfect
hedge (for $f_N$).

We can sum up the above results as follows.


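THEOREM 2 (‘Main formulas for a perfect hedge and its value’). In an arbitrary
arbitrage-free complete market there exists a self-financing perfect hedge
$\pi^* = (\beta^*, \gamma^*)$ with initial capital

$$X_0^{\pi^*} = C(f_N; P) \quad \Bigl(= B_0\,\widetilde{E}\,\frac{f_N}{B_N}\Bigr)$$

that replicates $f_N$ faithfully:

$$X_N^{\pi^*} = f_N \quad (P\text{-a.s.}).$$

The dynamics of the capital $X_n^{\pi^*}$ is described by the formulas

$$X_n^{\pi^*} = B_n\,\widetilde{E}\Bigl(\frac{f_N}{B_N} \Bigm| \mathcal{F}_n\Bigr), \qquad 0 \le n \le N;$$

the components $\gamma^* = (\gamma_n^*)$ are the same as in the ‘$\frac{S}{B}$-representation’

$$\widetilde{E}\Bigl(\frac{f_N}{B_N} \Bigm| \mathcal{F}_n\Bigr) = \widetilde{E}\,\frac{f_N}{B_N} + \sum_{k=1}^{n} \gamma_k^*\,\Delta\Bigl(\frac{S_k}{B_k}\Bigr), \qquad 1 \le n \le N,$$

and the components $\beta^* = (\beta_n^*)$ can be defined from the condition

$$X_n^{\pi^*} = \beta_n^* B_n + \gamma_n^* S_n.$$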

3. We consider now the issue of hedge pricing in a somewhat more general
framework, assuming that, in place of a single function $f_N$, we have a sequence
$f_0, f_1, \dots, f_N$ of pay-off functions and $f_i$ is $\mathcal{F}_i$-measurable, $0 \le i \le N$.

Let $\tau = \tau(\omega)$ be a fixed Markov time ranging in the set $\{0, 1, \dots, N\}$ and let $f_\tau$
be the terminal pay-off function constructed for this $\tau$ and $f_0, f_1, \dots, f_N$.


THEOREM 3. If an arbitrage-free $(B,S)$-market is $N$-complete, then it is also
$\tau$-complete, that is, there exist a self-financing portfolio $\pi$ and an initial capital
$x$ such that $X_0^{\pi} = x$ and $X_\tau^{\pi} = f_\tau$ ($P$-a.s.).


Proof. This is simple: we construct a new pay-off function $f_N^* = f_\tau \dfrac{B_N}{B_\tau}$; then the
perfect hedge $\pi^*$ for $f_N^*$ is simultaneously a perfect hedge for the initial pay-off
function $f_\tau$.

The corresponding price

$$C(f_\tau; P) = \min\{x\colon \exists\,\pi \text{ with } X_0^{\pi} = x \text{ and } X_\tau^{\pi} = f_\tau \ (P\text{-a.s.})\}$$

can be evaluated by the formula

$$C(f_\tau; P) = B_0\,\widetilde{E}\,\frac{f_\tau}{B_\tau}. \tag{8}$$


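4. One can put the following question in connection with ‘main formula’ (4).

Assume that we consider a complete and arbitrage-free $(B,S)$-market and let
$\widetilde{P}$ be a martingale measure for the discounted prices $\dfrac{S}{B}$. It will be useful to
formulate the last property in the following equivalent way: the vector process
$\Bigl(\dfrac{B}{B}, \dfrac{S^1}{B}, \dots, \dfrac{S^d}{B}\Bigr)$, i.e., the process $\Bigl(1, \dfrac{S^1}{B}, \dots, \dfrac{S^d}{B}\Bigr)$, is a $\widetilde{P}$-martingale.

We assume now that there exists another (positive) discounting process
$\overline{B} = (\overline{B}_n)_{n \le N}$ and a measure $\overline{P}$ equivalent to the initial measure $P$ such that the
discounted process

$$\Bigl(\frac{B}{\overline{B}}, \frac{S^1}{\overline{B}}, \dots, \frac{S^d}{\overline{B}}\Bigr)$$

is a $\overline{P}$-martingale.

One would expect the value of the price $C(f_N; P)$ defined by (1) to be independent
of the choice of the corresponding pairs $(B, \widetilde{P})$ and $(\overline{B}, \overline{P})$.

In fact, this is just the case: we have the equalities

$$B_0\,\widetilde{E}\,\frac{f_N}{B_N} = \overline{B}_0\,\overline{E}\,\frac{f_N}{\overline{B}_N}; \tag{9}$$

moreover, even the ‘price processes’

$$B_n\,\widetilde{E}\Bigl(\frac{f_N}{B_N} \Bigm| \mathcal{F}_n\Bigr) \quad \text{and} \quad \overline{B}_n\,\overline{E}\Bigl(\frac{f_N}{\overline{B}_N} \Bigm| \mathcal{F}_n\Bigr), \qquad n \le N, \tag{10}$$

are the same.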
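We now discuss the factors underlying this coincidence. To this end we assume
that $\overline{E}\,\dfrac{B_N}{\overline{B}_N} = 1$ (which is not a serious constraint). Then we can introduce a new
measure $\widehat{P}$ (on $\mathcal{F}_N$) by setting

$$d\widehat{P} = \overline{Z}_N\,d\overline{P},$$

where $\overline{Z}_N = \dfrac{B_N}{\overline{B}_N}$. Set $\overline{Z}_n = \dfrac{B_n}{\overline{B}_n}$, $Z_n = \dfrac{d\widehat{P}_n}{d\widetilde{P}_n}$, $\widehat{P}_n = (\widehat{P} \mid \mathcal{F}_n)$, and $\widetilde{P}_n = (\widetilde{P} \mid \mathcal{F}_n)$, $n \le N$.

The measure $\widehat{P}$ is a probability measure, and by Bayes's formula (see the lemma
in Chapter V, § 3a)

$$\widehat{E}\Bigl(\frac{S_N}{B_N} \Bigm| \mathcal{F}_n\Bigr) = \frac{1}{\overline{Z}_n}\,\overline{E}\Bigl(\overline{Z}_N\,\frac{S_N}{B_N} \Bigm| \mathcal{F}_n\Bigr) = \frac{\overline{B}_n}{B_n}\,\overline{E}\Bigl(\frac{S_N}{\overline{B}_N} \Bigm| \mathcal{F}_n\Bigr) = \frac{S_n}{B_n},$$

because $\overline{E}\Bigl(\dfrac{S_N}{\overline{B}_N} \Bigm| \mathcal{F}_n\Bigr) = \dfrac{S_n}{\overline{B}_n}$.

Hence the sequence $\Bigl(\dfrac{S_n}{B_n}\Bigr)_{n \le N}$ is a martingale both with respect to $\widetilde{P}$ and with
respect to $\widehat{P}$.

However, if the market in question is complete, then the martingale measure
must be unique, and therefore $\widehat{P} = \widetilde{P}$, that is, $Z_n = 1$ for $n \le N$, which, by the
definition of the $\overline{Z}_n$, brings us to the equalities

$$\overline{Z}_n = \frac{d\widetilde{P}_n}{d\overline{P}_n}, \qquad n \le N, \tag{11}$$

where $\overline{P}_n = (\overline{P} \mid \mathcal{F}_n)$, showing that

$$B_n\,\widetilde{E}\Bigl(\frac{f_N}{B_N} \Bigm| \mathcal{F}_n\Bigr) = \frac{B_n}{\overline{Z}_n}\,\overline{E}\Bigl(\overline{Z}_N\,\frac{f_N}{B_N} \Bigm| \mathcal{F}_n\Bigr) = \overline{B}_n\,\overline{E}\Bigl(\frac{f_N}{\overline{B}_N} \Bigm| \mathcal{F}_n\Bigr).$$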


This proves (9) and (10), so that, indeed, the value of the price $C(f_N; P)$ in a
complete market is independent of our choice of a discounting process. In Chapter VII,
§ 1b we shall consider discounting in the continuous-time case (and in
greater detail). At this point, however, we mention only that a successful choice of
a discounting process can in many cases considerably reduce the analytic complexity
of finding the prices $C(f_N; P)$ and the corresponding perfect hedges. See, for
instance, the calculations for ‘Russian options’ in § 5d and Chapter VIII, § 2c.
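
The independence of the price from the discounting process can be observed numerically. The sketch below is an illustration under assumptions not made in the text: a binomial model with bank account $B_n = 1$, and the normalized stock price $S_n / S_0$ taken as the second discounting process. The function name `price_two_ways` and the parameter values are hypothetical.

```python
from fractions import Fraction
from itertools import product

def price_two_ways(payoff, s0, a, b, n_steps):
    """Price f_N = payoff(S_N) with two discounting processes.

    First pair:  B_n = 1 with up-probability p_tilde = -a / (b - a).
    Second pair: discounting by S_n / S_0 with up-probability
                 p_bar = p_tilde * (1 + b), under which B divided by the new
                 discounting process, i.e. S_0 / S_n, is a martingale.
    """
    p_tilde = -a / (b - a)
    p_bar = p_tilde * (1 + b)
    price_first = price_second = 0
    for moves in product((a, b), repeat=n_steps):
        s, w_tilde, w_bar = s0, 1, 1
        for rho in moves:
            s *= 1 + rho
            up = rho == b
            w_tilde *= p_tilde if up else 1 - p_tilde
            w_bar *= p_bar if up else 1 - p_bar
        price_first += w_tilde * payoff(s)
        price_second += w_bar * payoff(s) / (s / s0)
    return price_first, price_second

first, second = price_two_ways(lambda s: max(s - 100, 0), Fraction(100),
                               Fraction(-1, 5), Fraction(1, 4), 3)
```

In this example the path weights of the two measures differ by exactly the ratio of the two discounting processes at time $N$, which is the mechanism behind the density relation discussed in the text.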


5. Thus, formula (4) fully solves on a complete arbitrage-free market the pricing
problem for perfect hedges in the case of the discounting process $B$. If $\widetilde{P}$ is the
corresponding martingale measure (i.e., $\dfrac{S}{B}$ is a martingale with respect to $\widetilde{P}$),
then a transition to another discounting process $\overline{B}$ changes also the martingale
measure: the new measure $\overline{P}$, in accordance with (11), can be obtained by the
formula

$$d\overline{P} = \frac{\overline{B}_N}{B_N}\,d\widetilde{P}. \tag{12}$$

On the other hand, if we are in an incomplete market, then there exist several
martingale measures and it is no longer easy to understand what we must call
the hedging price. If we have two distinct martingale measures, $\widetilde{P}_1$ and $\widetilde{P}_2$, and
therefore can give distinct definitions of what it means for a market to be arbitrage-free,
then the quantities $B_0\,\widetilde{E}_1\,\dfrac{f_N}{B_N}$ and $B_0\,\widetilde{E}_2\,\dfrac{f_N}{B_N}$ are, in general, also distinct
(see § 1c below).
6. For an illustration of our above discussion of various discounting processes and
the calculation of conditional expectations with respect to different measures we
consider the following example.

Let $f_N$ be the price of some contingent claim (expressed in USD). If we consider
a complete arbitrage-free (dollar) market, then the price of a perfect hedge is
$B_0\,\widetilde{E}\,\dfrac{f_N}{B_N}$, where $B = (B_n)_{n \le N}$ is a (dollar) bank account.

Next, we consider a market with prices in German marks (DEM). Expressed in
marks, the value of the same contingent claim becomes $f_N S_N$ (DEM), where

$$S_N = \Bigl(\frac{\mathrm{DEM}}{\mathrm{USD}}\Bigr)_N$$

is the cross rate at time $N$.

If $\overline{B} = (\overline{B}_n)_{n \le N}$ is a DEM-bank account and the corresponding market is
complete and arbitrage-free, then the price of a hedge against the contingent claim
$f_N S_N$ (in DEM) is

$$\overline{B}_0\,\overline{E}\,\frac{S_N f_N}{\overline{B}_N},$$

which, converted into dollars, makes up

$$S_0^{-1}\,\overline{B}_0\,\overline{E}\,\frac{S_N f_N}{\overline{B}_N}.$$

What are the conditions ensuring that, as one would anticipate, this dollar price
is equal to $B_0\,\widetilde{E}\,\dfrac{f_N}{B_N}$ or, in a more general form,

$$S_n^{-1}\,\overline{B}_n\,\overline{E}\Bigl(\frac{S_N f_N}{\overline{B}_N} \Bigm| \mathcal{F}_n\Bigr) = B_n\,\widetilde{E}\Bigl(\frac{f_N}{B_N} \Bigm| \mathcal{F}_n\Bigr)? \tag{13}$$
By 
We assume that the DEM-market is arbitrage-free: then for the cross rate 


DEM Sn 5 ; 
S= ; Sn = | = ) . the 5 =e t be a P-martingale. 
(Sn)neN: Sn E ). e process (F) must be a P-martingale 


= P, 
Hence E( =e | Fa) = Sn and if Zn = arn then. by Bayes’s formula. 
Bn B P 


n dPy, 


Hence 
n


If the USD-market is also arbitrage-free, then (=) is a P-martingale, 
n<gNn 


Bn 
Sn =E(=¥ | Fn). (16) 


By (15), (16), and our assumption that the USD-market is complete, and therefore 


so that 


the measure P is unique, we obtain 


— ee ai 1 17 
Bn dP oo 
(cf. (12)), and for all n < N, 
i 
Bie Me 5, (18) 
Bn dP, 


~ . dP. : 
Since the process Z = (Zn, Fn, P)ncn with Zn = ——" is a martingale, equal- 
n 


n 
that this property would ensure the coincidence of the (dollar) prices of hedging 


against fy in the USD and the DEM-market before any calculations: we could 
regard B = (Bn)ngn as one of the basic securities on the dollar market with dollar 


B ; = : 
ity (18) ineans that & is also a P-martingale. In fact one could foresee 
ngN 


bank account B = (Bn)n<n- 


§1c. Main Hedge Pricing Formula. Incomplete Markets 


1. As shown in the preceding section, the price $C(f_N; P)$ of perfect hedging on a
complete arbitrage-free market has the following expression:

$$C(f_N; P) = B_0\,\widetilde{E}\,\frac{f_N}{B_N}, \tag{1}$$

where $\widetilde{E}$ is averaging with respect to the (unique) martingale measure $\widetilde{P}$ such that
$\dfrac{S}{B}$ is a $\widetilde{P}$-martingale.

A similar question of hedging prices can be put, of course, also on an incomplete 
market. However, there does not necessarily exist a perfect self-financing hedge on 
such a market, therefore we must modify our definition of the hedging price and 
consider a somewhat wider class than self-financing strategies to which we could 
stick in the case of a complete market. 

We recall that the value $X^{\pi}$ of a self-financing strategy $\pi = (\beta, \gamma)$ on a complete
market can in fact be defined in two ways: we either write

$$X_n^{\pi} = \beta_n B_n + \gamma_n S_n \tag{2}$$

with $B_{n-1}\,\Delta\beta_n + S_{n-1}\,\Delta\gamma_n = 0$ or

$$X_n^{\pi} = X_0^{\pi} + \sum_{k=1}^{n} (\beta_k\,\Delta B_k + \gamma_k\,\Delta S_k) \tag{3}$$

(see Chapter V, § 1a for greater detail).

Representation (3) is more convenient in a certain sense for it visualizes the
dynamics of the growth of capital: $X_0^{\pi}$ is the share of the initial capital in $X_n^{\pi}$,
while

$$\Delta X_n^{\pi} = \beta_n\,\Delta B_n + \gamma_n\,\Delta S_n \tag{4}$$

is the increment.

In our case of hedging on an incomplete market it seems reasonable to consider,
alongside the portfolio $\pi = (\beta, \gamma)$, also the consumption process $C = (C_n)_{n \ge 0}$, which
is a non-negative non-decreasing process with $\mathcal{F}_n$-measurable $C_n$ and $C_0 = 0$.

In fact, we already discussed this situation in Chapter V, § 1a, where we called it
the case with ‘consumption’. We assumed there that, in place of (4), the dynamics
of the capital $X^{\pi,C}$ corresponding to a portfolio $\pi$ and consumption $C$ can be
described by the relations

$$\Delta X_n^{\pi,C} = \beta_n\,\Delta B_n + \gamma_n\,\Delta S_n - \Delta C_n, \tag{5}$$

where $\beta_n\,\Delta B_n + \gamma_n\,\Delta S_n$ is the share of the total increment that can be ascribed to the
composition of the portfolio and is due to the ‘market-related’ changes $\Delta B_n$ and
$\Delta S_n$, while $\Delta C_n$ characterizes the expenses on consumption (e.g., the expenditures
on the changes in the portfolio).

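Thus, we shall now assume that the value $X^{\pi,C}$ of the strategy $(\pi, C)$ can (by 
analogy with (3)) be calculated by the formulas 

$$X_n^{\pi,C} = X_0^{\pi,C} + \sum_{k=1}^{n} (\beta_k \Delta B_k + \gamma_k \Delta S_k) - C_n, \quad n \ge 1, \tag{6}$$

which is equivalent to 

$$\Delta\Big(\frac{X_n^{\pi,C}}{B_n}\Big) = \gamma_n \Delta\Big(\frac{S_n}{B_n}\Big) - \frac{\Delta C_n}{B_{n-1}}.$$

Remark 1. Setting $\widetilde{\beta}_k = \beta_k - \dfrac{\Delta C_k}{\Delta B_k}$ we see from (6) that 

$$X_n^{\pi,C} = X_0^{\pi,C} + \sum_{k=1}^{n} (\widetilde{\beta}_k \Delta B_k + \gamma_k \Delta S_k).$$

This formula is very much similar to (3). However, while $\beta_k$ in (3) is $\mathcal{F}_{k-1}$-measurable, 
$\widetilde{\beta}_k$ is now only $\mathcal{F}_k$-measurable. 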


Remark 2. We have already mentioned that, generally speaking, perfect hedging is 
unattainable in incomplete markets, i.e., there does not necessarily exist a hedge 
$\pi = (\beta, \gamma)$ such that $X_N^\pi = f_N$ (P-a.s.). At the same time, this does not rule out 
the possibility that, modifying our definition of an admissible strategy, we can attain 
the level of terminal capital offsetting (P-a.s.) the pay-off $f_N$. As will be clear from 
the proof of the theorem below, our calling upon 'consumption' enables us to find 
a strategy $(\pi, C)$ such that $X_N^{\pi,C} = f_N$ (P-a.s.). This is one 'technical' argument in 
favor of considering consumption $C$ alongside the portfolio $\pi$. On the other hand, 
our introduction of strategies with 'consumption', which must satisfy constraints of 
the form $\Delta C_n \ge 0$, has clear economic implications. 


2. DEFINITION. The upper price of European hedging (against an $\mathcal{F}_N$-measurable 
pay-off $f_N$) is the quantity 

$$C^*(f_N; P) = \inf\{x : \exists\,(\pi, C) \text{ with } X_0^{\pi,C} = x \text{ and } X_N^{\pi,C} \ge f_N \ (P\text{-a.s.})\}. \tag{7}$$


Remark. Besides the upper price we can also consider the lower price of hedging 
(see the definition in § 1b). In our subsequent discussions we shall consider only the 
upper price and call it simply the price of hedging. 

Let $\mathcal{P}(P)$ be the set of all martingale measures $\widetilde{P}$ equivalent to $P$. We assume 
that $\mathcal{P}(P) \ne \varnothing$. 

The central result of pricing theory on incomplete arbitrage-free markets, which 
generalizes formula (1), is as follows. 


THEOREM 1 (Main formula of the price of European hedging on incomplete markets). 
Let $f_N$ be a non-negative bounded $\mathcal{F}_N$-measurable function. Then, on an 
incomplete arbitrage-free market, the price $C^*(f_N; P)$ can be calculated by the 
formula 

$$C^*(f_N; P) = \sup_{\widetilde{P} \in \mathcal{P}(P)} B_0 E_{\widetilde{P}}\,\frac{f_N}{B_N}, \tag{8}$$

where $E_{\widetilde{P}}$ is averaging with respect to the measure $\widetilde{P}$. 


We have already encountered a special case of this result (Theorem 1 in Chapter V, 
§ 1c; see also [93]) in the case of a single-step model. 
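A minimal numerical sketch may make formula (8) concrete in a single-step model. Everything below is an illustrative assumption rather than part of the text: a trinomial model with prices 80, 100, 120, a bank account B = 1, a call pay-off with strike 100, and the hypothetical function names. The first function approximates the supremum over martingale measures on a grid; the second searches for the cheapest starting capital of a superhedge.

```python
from fractions import Fraction

S0 = Fraction(100)
S1 = [Fraction(80), Fraction(100), Fraction(120)]   # assumed terminal prices
f = [max(s - S0, Fraction(0)) for s in S1]          # assumed call pay-off, strike 100

def sup_over_martingale_measures(grid=1000):
    """Approximate sup of E~ f over equivalent martingale measures (B = 1).

    For these prices every martingale measure has the form
    (t, 1 - 2t, t) with 0 < t < 1/2, so a grid in t is enough.
    """
    best = Fraction(0)
    for k in range(1, grid):
        t = Fraction(k, 2 * grid)
        q = [t, 1 - 2 * t, t]
        assert sum(qi * s for qi, s in zip(q, S1)) == S0   # martingale property
        best = max(best, sum(qi * fi for qi, fi in zip(q, f)))
    return best

def cheapest_superhedge(grid=1000):
    """Smallest x with x + gamma*(S1 - S0) >= f in every state (gamma on a grid in [0, 1])."""
    best = None
    for k in range(grid + 1):
        gamma = Fraction(k, grid)
        x = max(fi - gamma * (s - S0) for fi, s in zip(f, S1))
        best = x if best is None else min(best, x)
    return best
```

In this sketch the supremum is not attained by any equivalent measure (the grid values only approach it from below), while the superhedging capital is attained; the two numbers agree up to the grid resolution, as (8) suggests.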

The crucial point in the proof of (8) is the so-called optional decomposition 
(see § 2d below), which has a rather technical proof. The existence of this optional 
decomposition and formula (8) were established for the first time by N. El Karoui 
and M. Quenez [136] and D. O. Kramkov [281]; as regards generalizations and other 
proofs, see also [99], [163], and [164]. 
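3. Proof of the theorem. Let $(\pi, C)$ be an $(x, f_N)$-hedge, i.e., assume that 
$X_0^{\pi,C} = x$ and $X_N^{\pi,C} \ge f_N$ (P-a.s.). 
Then (cf. formula (2) in § 1b) 

$$\frac{f_N}{B_N} \le \frac{X_N^{\pi,C}}{B_N} = \frac{x}{B_0} + \sum_{k=1}^{N} \gamma_k \Delta\Big(\frac{S_k}{B_k}\Big) - \sum_{k=1}^{N} \frac{\Delta C_k}{B_{k-1}} \le \frac{x}{B_0} + \sum_{k=1}^{N} \gamma_k \Delta\Big(\frac{S_k}{B_k}\Big), \tag{9}$$

so that, for each measure $\widetilde{P} \in \mathcal{P}(P)$, 

$$B_0 E_{\widetilde{P}}\,\frac{f_N}{B_N} \le x, \tag{10}$$

because $E_{\widetilde{P}} \sum_{k=1}^{N} \gamma_k \Delta\Big(\dfrac{S_k}{B_k}\Big) = 0$, which is a consequence of assertion 2) of the lemma 
in Chapter II, § 1c and the inequality $\sum_{k=1}^{N} \gamma_k \Delta\Big(\dfrac{S_k}{B_k}\Big) \ge -\dfrac{x}{B_0}$ following by (9). 
Hence 

$$\sup_{\widetilde{P} \in \mathcal{P}(P)} B_0 E_{\widetilde{P}}\,\frac{f_N}{B_N} \le C^*(f_N; P). \tag{11}$$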
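To prove the reverse inequality we set 

$$Y_n = \operatorname*{ess\,sup}_{\widetilde{P} \in \mathcal{P}(P)} E_{\widetilde{P}}\Big(\frac{f_N}{B_N} \,\Big|\, \mathcal{F}_n\Big), \tag{12}$$

where, by definition, the essential supremum $Y_n$ is the $\mathcal{F}_n$-measurable random 
variable that, on the one hand, satisfies the inequality 

$$Y_n \ge E_{\widetilde{P}}\Big(\frac{f_N}{B_N} \,\Big|\, \mathcal{F}_n\Big) \quad (P\text{-a.s.}) \tag{13}$$

for each measure $\widetilde{P} \in \mathcal{P}(P)$, and, on the other hand, has the following property 
('minimality'): if $\widetilde{Y}_n$ is another variable dominating the right-hand side of (13), 
then $Y_n \le \widetilde{Y}_n$ (P-a.s.). 

As shown in § 2b, the sequence $Y = (Y_n, \mathcal{F}_n)_{n \le N}$ is a supermartingale with 
respect to an arbitrary (!) measure $Q \in \mathcal{P}(P)$, i.e., 

$$E_Q(Y_{n+1} \mid \mathcal{F}_n) \le Y_n \quad (Q\text{-a.s.}). \tag{14}$$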
Recall that, as follows from the classical Doob decomposition (§ 1b, Chapter II), for 
each particular measure $Q$ we can find a martingale $M^Q = (M_n^Q, \mathcal{F}_n, Q)_{0 \le n \le N}$, where 
$M_0^Q = 0$, and a predictable non-decreasing process $A^Q = (A_n^Q, \mathcal{F}_{n-1}, Q)_{1 \le n \le N}$, 
where $A_0^Q = 0$, such that 

$$Y_n = Y_0 + M_n^Q - A_n^Q. \tag{15}$$

It is remarkable that if $Y = (Y_n, \mathcal{F}_n)$ is a supermartingale with respect to each 
measure $Q$ in the family $\mathcal{P}(P)$, then $Y$ has a universal (i.e., independent of $Q$) 
decomposition 

$$Y_n = Y_0 + M_n - C_n, \tag{16}$$

where $M = (M_n, \mathcal{F}_n)$ is a martingale with respect to each measure $Q \in \mathcal{P}(P)$ 
and $C = (C_n, \mathcal{F}_n)$ is a non-decreasing process with $C_0 = 0$. 

We point out that while the process $A^Q$ in the Doob decomposition (15) is 
predictable (i.e., the $A_n^Q$ are $\mathcal{F}_{n-1}$-measurable), the process $C = (C_n, \mathcal{F}_n)$ in (16) 
is only optional (i.e., the $C_n$ are $\mathcal{F}_n$-measurable). This explains why (16) is called 
the optional decomposition. 

In the special case of the supermartingale $Y = (Y_n, \mathcal{F}_n)$ defined by (12) we can 
further specify the structure of the martingale $M = (M_n, \mathcal{F}_n)$: 

$$M_n = \sum_{k=1}^{n} \gamma_k \Delta\Big(\frac{S_k}{B_k}\Big), \tag{17}$$

where $\gamma = (\gamma_n, \mathcal{F}_{n-1})$ is a predictable process. (We emphasize that this is far 
from trivial and can be established simultaneously with the proof of the optional 
decomposition; see § 2d.) 

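For the processes $\gamma$, $C$, and $Y_0$ introduced by (16) and (17) we shall now construct 
a portfolio $\widetilde{\pi} = (\widetilde{\beta}, \widetilde{\gamma})$ and a consumption process $\widetilde{C}$ such that for the 
corresponding capital we have $X_0^{\widetilde{\pi},\widetilde{C}} = B_0 \sup_{\widetilde{P} \in \mathcal{P}(P)} E_{\widetilde{P}}\,\dfrac{f_N}{B_N}$ and $X_N^{\widetilde{\pi},\widetilde{C}} \ge f_N$. Of 
course, this means that 

$$C^*(f_N; P) \le X_0^{\widetilde{\pi},\widetilde{C}} = B_0 \sup_{\widetilde{P} \in \mathcal{P}(P)} E_{\widetilde{P}}\,\frac{f_N}{B_N},$$

which, together with (11), brings us to (8). 

The required portfolio $\widetilde{\pi} = (\widetilde{\beta}, \widetilde{\gamma})$ and the consumption process $\widetilde{C}$ can be defined 
as follows: 

$$\widetilde{\gamma}_n = \gamma_n, \qquad \widetilde{\beta}_n = Y_n - \gamma_n \frac{S_n}{B_n}, \tag{18}$$

$$\widetilde{C}_n = \sum_{k=1}^{n} B_{k-1} \Delta C_k, \tag{19}$$

where $\gamma$ and $C$ are as in the optional decomposition of the supermartingale $Y$. 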
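For $\widetilde{\pi}$ and $\widetilde{C}$ so defined the starting capital is 

$$X_0^{\widetilde{\pi},\widetilde{C}} = \widetilde{\beta}_0 B_0 + \widetilde{\gamma}_0 S_0 = Y_0 B_0.$$

In the scheme with consumption we assume (see Chapter V, § 1a.4) that the increment 
in the capital can be described by the formula 

$$\Delta X_n^{\widetilde{\pi},\widetilde{C}} = \widetilde{\beta}_n \Delta B_n + \widetilde{\gamma}_n \Delta S_n - \Delta \widetilde{C}_n, \tag{20}$$

which, as already mentioned, yields 

$$\Delta\Big(\frac{X_n^{\widetilde{\pi},\widetilde{C}}}{B_n}\Big) = \widetilde{\gamma}_n \Delta\Big(\frac{S_n}{B_n}\Big) - \frac{\Delta \widetilde{C}_n}{B_{n-1}} \tag{21}$$

(cf. also (27) in Chapter V, § 1a). 
By (16)-(19), 

$$\Delta\Big(\frac{X_n^{\widetilde{\pi},\widetilde{C}}}{B_n}\Big) = \Delta Y_n, \tag{22}$$

and since $\dfrac{X_0^{\widetilde{\pi},\widetilde{C}}}{B_0} = Y_0$, it follows that 

$$\frac{X_N^{\widetilde{\pi},\widetilde{C}}}{B_N} = Y_N = \frac{f_N}{B_N}. \tag{23}$$

Thus, $X_N^{\widetilde{\pi},\widetilde{C}} = f_N$, so that the proposed strategy $(\widetilde{\pi}, \widetilde{C})$ with starting capital 

$$X_0^{\widetilde{\pi},\widetilde{C}} = B_0 Y_0 = B_0 \sup_{\widetilde{P} \in \mathcal{P}(P)} E_{\widetilde{P}}\,\frac{f_N}{B_N}$$

gives one a perfect hedge: $X_N^{\widetilde{\pi},\widetilde{C}} = f_N$. 
Hence 

$$C^*(f_N; P) \le B_0 \sup_{\widetilde{P} \in \mathcal{P}(P)} E_{\widetilde{P}}\,\frac{f_N}{B_N}.$$

Together with (11) this proves required formula (8) (provided that we have the 
optional decomposition). 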

Thus, we have proved Theorem 1 and, incidentally, established the following 
result (cf. Theorem 2 in § 1b). 
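THEOREM 2 (Main formulas for a perfect hedge, its value, and consumption). On 
an arbitrage-free market it is possible to find a self-financing hedge $\pi^* = (\beta^*, \gamma^*)$ 
and consumption $C^*$ such that the value of this hedge, $X_n^{\pi^*,C^*} = \beta_n^* B_n + \gamma_n^* S_n$, changes 
in accordance with the balance condition $\Delta X_n^{\pi^*,C^*} = \beta_n^* \Delta B_n + \gamma_n^* \Delta S_n - \Delta C_n^*$ and 
satisfies the relations 

$$X_0^{\pi^*,C^*} = C^*(f_N; P) \quad \Big( = \sup_{\widetilde{P} \in \mathcal{P}(P)} B_0 E_{\widetilde{P}}\,\frac{f_N}{B_N} \Big)$$

and 

$$X_N^{\pi^*,C^*} = f_N \quad (P\text{-a.s.}).$$

The value $X_n^{\pi^*,C^*}$ of this hedge can be calculated by the formula 

$$X_n^{\pi^*,C^*} = B_n \operatorname*{ess\,sup}_{\widetilde{P} \in \mathcal{P}(P)} E_{\widetilde{P}}\Big(\frac{f_N}{B_N} \,\Big|\, \mathcal{F}_n\Big);$$

the components $\gamma^* = (\gamma_n^*)$ and $C^* = (C_n^*)$ can be found from the optional decomposition 

$$\operatorname*{ess\,sup}_{\widetilde{P} \in \mathcal{P}(P)} E_{\widetilde{P}}\Big(\frac{f_N}{B_N} \,\Big|\, \mathcal{F}_n\Big) = \sup_{\widetilde{P} \in \mathcal{P}(P)} E_{\widetilde{P}}\,\frac{f_N}{B_N} + \sum_{k=1}^{n} \gamma_k^* \Delta\Big(\frac{S_k}{B_k}\Big) - \sum_{k=1}^{n} \frac{\Delta C_k^*}{B_{k-1}},$$

and the components $\beta^* = (\beta_n^*)$ can be found from the condition $X_n^{\pi^*,C^*} = \beta_n^* B_n + \gamma_n^* S_n$. 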


§1d. Hedge Pricing on the Basis of the Mean Square Criterion 


1. Let $f_N = f_N(\omega)$ be an $\mathcal{F}_N$-measurable pay-off. In a complete arbitrage-free 
market there exist a starting capital $x$ and a strategy $\pi$ such that a trader can 
replicate $f_N$ faithfully (in the sense that $X_N^\pi(x) = f_N$ with probability one, where 
$X_N^\pi(x)$ is the value $X_N^\pi$ of the strategy $\pi$ such that $X_0^\pi = x$). 

However, if a market (with or without arbitrage) is incomplete, then the situation 
becomes considerably more complex and one cannot hope to replicate $f_N$ faithfully. 

In § 1c we considered the calculations of the hedging price $C^*(f_N; P)$ on an 
arbitrage-free market when the hedging strategy $\pi$ was to ensure that 

$$X_N^\pi(C^*(f_N; P)) \ge f_N \quad (P\text{-a.s.}).$$

In the present section we shall understand optimal hedging in a somewhat different 
sense, as the replication of $f_N$ with 'maximum precision'. 

Our choice of the measure of accuracy is, in a sense, a matter of convention; it 
depends on one's aims, the chances of finding a precise solution of the corresponding 
optimization problem, and so on. 
In what follows, we measure the quality of replication by the mean square deviation 

$$R_N(\pi; x) = E[X_N^\pi(x) - f_N]^2, \tag{1}$$

which in many cases enables one to find the 'optimal' components $x^*$ and $\pi^*$ bringing 
about the minimum of $E[X_N^\pi(x) - f_N]^2$: 

$$\inf_{(\pi, x)} R_N(\pi; x) = R_N(\pi^*; x^*). \tag{2}$$


2. Let $(\Omega, \mathcal{F}, (\mathcal{F}_n)_{n \le N}, P)$ be a fixed filtered probability space, let $\mathcal{F}_0 = \{\varnothing, \Omega\}$, 
and let $\mathcal{F}_N = \mathcal{F}$. Let $S = (S_n^1, \ldots, S_n^d)_{n \le N}$ be a sequence of prices of some 
d-dimensional asset and assume that $E f_N^2 < \infty$. 

If the sequence of prices is a martingale and, moreover, a square integrable martingale 
with respect to the original measure $P$, then the optimization problem (2) is 
easy to analyze in the class of strategies $\pi$ satisfying the inequality $E(X_N^\pi(x))^2 < \infty$. 
(We emphasize that we do not assume the uniqueness of the martingale measure $P$ 
and, therefore, the completeness of the market.) 

For a strategy $\pi = (\gamma^1, \ldots, \gamma^d)$, where the $\gamma^i = (\gamma_n^i)_{n \le N}$ are predictable variables, 
let 

$$X_n^\pi(x) = x + \sum_{k=1}^{n} (\gamma_k, \Delta S_k) \quad \Big( = x + \sum_{k=1}^{n} \sum_{i=1}^{d} \gamma_k^i \Delta S_k^i \Big) \tag{3}$$

be its value. 
Since the sequence $(X_n^\pi(x))_{n \le N}$ is a martingale, it follows that 

$$E X_N^\pi(x) = x. \tag{4}$$

Setting $\xi = X_N^\pi(x) - f_N$ we see in view of the obvious equality 

$$E\xi^2 = (E\xi)^2 + E(\xi - E\xi)^2$$

that 

$$R_N(\pi; x) = [E(f_N - x)]^2 + E[(X_N^\pi(x) - x) - (f_N - E f_N)]^2. \tag{5}$$

We shall show below that for each pair $(\pi, x)$ with $E(X_N^\pi(x))^2 < \infty$ there exists 
a pair $(\pi^*, x)$ such that $R_N(\pi; x) \ge R_N(\pi^*; x)$ and $X_N^{\pi^*}(x) - x$ is independent of $x$. 
Hence it follows by (5) that the greatest lower bound $\inf_x \big[\inf_\pi R_N(\pi; x)\big]$ is attained 
for 

$$x^* = E f_N. \tag{6}$$


For $i = 1, \ldots, d$ we set 

$$\gamma_n^{*i} = \frac{E(f_N \Delta S_n^i \mid \mathcal{F}_{n-1})}{E((\Delta S_n^i)^2 \mid \mathcal{F}_{n-1})} \tag{7}$$

(where we put 0/0 = 0) and consider the martingale $L^* = (L_n^*)_{n \le N}$ with 

$$L_n^* = E\Big[ f_N - \sum_{k=1}^{N} (\gamma_k^*, \Delta S_k) \,\Big|\, \mathcal{F}_n \Big] - x. \tag{8}$$

Clearly, $f_N$ has the following decomposition: 

$$f_N = x + \sum_{k=1}^{N} (\gamma_k^*, \Delta S_k) + L_N^*. \tag{9}$$

Using the definition (7) we can verify directly that 

$$E(\Delta L_n^* \, (\gamma_n, \Delta S_n) \mid \mathcal{F}_{n-1}) = 0. \tag{10}$$

Note that this property is equivalent to the 'orthogonality' of the square integrable 
martingales $(L_n^*)_{n \le N}$ and $\Big(\sum_{k=1}^{n} (\gamma_k, \Delta S_k)\Big)_{n \le N}$ in the sense that their product is 
also a martingale. It is worth noting in this connection that (9) is called the 
Kunita-Watanabe decomposition in the 'general theory of martingales'. 
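By (9) and (10) we obtain that for each pair $(\pi, x)$, 

$$\begin{aligned}
R_N(\pi; x) &= E\Big[ f_N - \Big(x + \sum_{k=1}^{N} (\gamma_k, \Delta S_k)\Big) \Big]^2 \\
&= E\Big[ \sum_{k=1}^{N} (\gamma_k^* - \gamma_k, \Delta S_k) + L_N^* \Big]^2 \\
&= E\Big[ \sum_{k=1}^{N} (\gamma_k^* - \gamma_k, \Delta S_k) \Big]^2 + E[L_N^*]^2 \ge E[L_N^*]^2 \\
&= E\Big[ f_N - \Big(x + \sum_{k=1}^{N} (\gamma_k^*, \Delta S_k)\Big) \Big]^2 = R_N(\pi^*; x).
\end{aligned} \tag{11}$$

Moreover, the first inequality turns to an equality for $\gamma = \gamma^*$. 
Thus, we have proved the following result. 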


THEOREM. Assume that the original measure $P$ is a martingale measure. Then 
the optimal hedge $\pi^* = (\gamma^{*1}, \ldots, \gamma^{*d})$ in the problem (2) (in the sense of the mean 
square criterion) is described by formulas (7), $x^* = E f_N$, and 

$$R_N(\pi^*; x^*) = E\Big[ f_N - \Big( x^* + \sum_{k=1}^{N} \sum_{i=1}^{d} \gamma_k^{*i} \Delta S_k^i \Big) \Big]^2. \tag{12}$$
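The recipe of the theorem can be checked by hand in the simplest case $N = 1$, $d = 1$. The sketch below is only an illustration under assumed data (a three-point distribution of the increment with zero mean, so that $P$ is a martingale measure, and a call-type pay-off); the names `expect` and `risk` are hypothetical helpers.

```python
from fractions import Fraction

# Assumed one-step model: increments of S and their P-probabilities (E dS = 0).
dS = [Fraction(-1), Fraction(0), Fraction(1)]
p = [Fraction(1, 4), Fraction(1, 2), Fraction(1, 4)]
f = [max(d, Fraction(0)) for d in dS]     # assumed pay-off (S_1 - S_0)^+

def expect(values):
    """Expectation with respect to the assumed measure P."""
    return sum(pi * v for pi, v in zip(p, values))

def risk(x, gamma):
    """Mean square deviation E[x + gamma*dS - f]^2, cf. (1) and (3)."""
    return expect([(x + gamma * d - fi) ** 2 for d, fi in zip(dS, f)])

# Formula (7) with N = 1 (F_0 is trivial, so conditional means are plain means).
gamma_star = expect([fi * d for fi, d in zip(f, dS)]) / expect([d * d for d in dS])
# Formula (6).
x_star = expect(f)
```

With these assumed numbers `gamma_star` is 1/2 and `x_star` is 1/4; moving either component away from these values increases `risk`, while the minimal risk stays strictly positive because the one-step market is incomplete.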
Remark. If $P$ is not a martingale measure (with respect to the prices $S$), then the 
question of the existence of an optimal pair $(\pi^*, x^*)$ and the search for it become fairly 
complicated. See, e.g., the papers of H. Föllmer, M. Schweizer, and D. Sondermann 
[167], [168], [425], and also [195], [194] in this connection. 

We note also that the concept of a minimal martingale measure that we men- 
tioned in passing at the end of §3d, Chapter V has been developed precisely in 
connection with the above problem of hedging on the basis of the mean square test. 


§ 1e. Forward Contracts and Futures Contracts 


1. In the present section we shall show how to price forward contracts and futures 
contracts, important investment instruments used on financial markets alongside 
options. 

In accordance with the definitions in Chapter I, § 1c, forwards and futures are 
sale contracts for some asset that must be delivered at a specified instant in the 
future at a specified price (the ‘forward’ or the ‘futures’ price, respectively). 

There exists an essential distinction between futures and forwards, although 
both are sale contracts. 

Forwards are in fact mere sale agreements between the two parties concerned, 
with no intermediaries. 

Futures are also sale agreements, but they are concluded at an exchange and 
involve a clearing house through which the payments are made and which is the 
guarantor of the contract. 


2. Assume that the market price of the asset in question can be described by a 
stochastic sequence $S = (S_n)_{n \le N}$, where $N$ is the maturity date of the contract, 
which we identify with the instant of delivery. 

Clearly, if the deal is struck at time $N$, when the market price of the asset 
is $S_N$, then for any reasonable definition of the forward or the futures price it must 
be equal to $S_N$. This becomes another matter, of course, if the contract is sold at 
time $n < N$; the crucial question here is how we must understand the fair price of 
the contract (on an arbitrage-free market). 

For a formalization, let us consider the scheme of a (B, S)-market described in 
Chapter V, § 1a, where $B = (B_n)$ is a bank account and $S = (S_n)$ is the traded 
asset. (If the asset in question is in fact one component of a d-dimensional vector 
of risk assets, then, in the arbitrage-free case, all our conclusions remain valid.) 

We shall consider the case with dividends discussed already in § 1a.4, when the 
value $X^\pi = (X_n^\pi)_{n \le N}$ of the buyer's strategy $\pi = (\beta, \gamma)$ is described by the formula 

$$X_n^\pi = \beta_n B_n + \gamma_n \Delta D_n, \tag{1}$$

and its changes are described as 

$$\Delta X_n^\pi = \beta_n \Delta B_n + \gamma_n \Delta D_n, \tag{2}$$

where $\gamma_n$ is the number of 'units' of the asset bought and $D = (D_n, \mathcal{F}_n)_{n \le N}$, 
$D_0 = 0$, is the (not necessarily positive) process of overall dividends connected to the 
asset $S$. 

We describe now the structure of dividends in the above cases of a forward or a 
futures contract and find the 'fair' prices of these contracts. 


3. Assume that a forward contract is sold at time $n$ and that both parties agree 
(on the basis of the 'information' $\mathcal{F}_n$) that the (forward) delivery price of the asset 
will be $F_n(N)$. 

Then, by the very mechanics of forward contracts, the overall dividends (positive 
or negative) can be represented as follows: 

$$D_k = 0, \quad n \le k < N, \tag{3}$$

and 

$$D_N = S_N - F_n(N). \tag{4}$$

By (1) and (2) (see also (24) in Chapter V, § 1a) we obtain 

$$\Delta\Big(\frac{X_k^\pi}{B_k}\Big) = \gamma_k \frac{\Delta D_k}{B_k}, \tag{5}$$

and therefore 

$$\frac{X_N^\pi}{B_N} = \frac{X_n^\pi}{B_n} + \sum_{k=n+1}^{N} \gamma_k \frac{\Delta D_k}{B_k}. \tag{6}$$

Clearly, for a forward contract sold at time $n$ we can set $\gamma_k = 0$ for $k \le n$ and 
$\gamma_k = \gamma_{n+1}$ for all $k \ge n + 1$, where $\gamma_{n+1}$ is the number of units of the asset $S$ 
requested in the contract. 

By (6) we obtain 

$$\frac{X_N^\pi}{B_N} = \frac{X_n^\pi}{B_n} + \gamma_{n+1} \frac{S_N - F_n(N)}{B_N}, \tag{7}$$

and we immediately arrive at the following conclusion. 

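Assume that the (B, S)-market under consideration is arbitrage-free and complete. 
Let $\widetilde{P}$ be the unique martingale measure such that $\Big(\dfrac{S_n}{B_n}\Big)_{n \le N}$ is a martingale 
with respect to it. Assume also that the $\mathcal{F}_n$-measurable prices $F_n(N)$ satisfy the 
relation 

$$E_{\widetilde{P}}\Big( \frac{S_N - F_n(N)}{B_N} \,\Big|\, \mathcal{F}_n \Big) = 0, \quad n \le N, \tag{8}$$

i.e., let 

$$F_n(N) = \frac{E_{\widetilde{P}}\Big(\dfrac{S_N}{B_N} \,\Big|\, \mathcal{F}_n\Big)}{E_{\widetilde{P}}\Big(\dfrac{1}{B_N} \,\Big|\, \mathcal{F}_n\Big)}. \tag{9}$$

Then we see from (7) that 

$$E_{\widetilde{P}}\Big( \frac{X_N^\pi}{B_N} \,\Big|\, \mathcal{F}_n \Big) = \frac{X_n^\pi}{B_n}, \tag{10}$$

so that the forward contract sold at time $n$ at the price $F_n(N)$ defined by (9) is 
arbitrage-free (i.e., if $X_n^\pi = 0$ and $P(X_N^\pi \ge 0) = 1$, then $P(X_N^\pi = 0) = 1$: see 
Definition 2 in Chapter V, § 2a). Of course, the value of $F_n(N)$ (the forward price) 
can be regarded as the fair price of the forward contract. 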

Note that our assumption about an arbitrage-free (B, S)-market gives us the 
equality 

$$E_{\widetilde{P}}\Big( \frac{S_N}{B_N} \,\Big|\, \mathcal{F}_n \Big) = \frac{S_n}{B_n}.$$

Thus, we see from (9) that the arbitrage-free forward prices $F_n(N)$ can be defined 
as follows: 

$$F_n(N) = \frac{S_n / B_n}{E_{\widetilde{P}}\Big(\dfrac{1}{B_N} \,\Big|\, \mathcal{F}_n\Big)}, \quad n \le N. \tag{11}$$

4. We now proceed to futures contracts. Assume that we sell a contract of this 
kind at time $n$, at an $\mathcal{F}_n$-measurable price $\Phi_n(N)$ (the futures price). Immediately, 
the mechanism of settling through the clearing house comes into play. This can be 
roughly (skipping the issue of the margin account, the amounts deposited into it, 
and so on) described as follows in terms of (positive or negative) dividends. 

If the market futures price at time $n + 1$, $\Phi_{n+1}(N)$, turns out less than $\Phi_n(N)$, 
then the buyer puts the amount $\Phi_n(N) - \Phi_{n+1}(N)$ into the seller's account. If 
$\Phi_{n+1}(N) > \Phi_n(N)$, then, conversely, it is the seller who deposits the amount 
$\Phi_{n+1}(N) - \Phi_n(N)$ into the buyer's account. 

We shall set $\delta_0 = \Phi_0(N)$ and 

$$\delta_n = \Phi_n(N) - \Phi_{n-1}(N), \quad n \ge 1.$$

Also, let 

$$D_n = \delta_0 + \delta_1 + \cdots + \delta_n, \tag{12}$$

so that $\Delta D_n = \delta_n$, $n \ge 1$. 
By (6) we see that (cf. (7)) 

$$\frac{X_N^\pi}{B_N} = \frac{X_n^\pi}{B_n} + \gamma_{n+1} \sum_{k=n+1}^{N} \frac{\Delta D_k}{B_k}. \tag{13}$$

Hence, similarly to forward contracts, we conclude that if $\widetilde{P}$ is a unique martingale 
measure for the (B, S)-market in question, then the condition 

$$E_{\widetilde{P}}\Big( \sum_{k=n+1}^{N} \frac{\Delta D_k}{B_k} \,\Big|\, \mathcal{F}_n \Big) = 0 \tag{14}$$

imposed on the prices $\Phi_n(N), \ldots, \Phi_N(N)$ ensures that the futures contract sold 
at time $n$ is arbitrage-free. 

Assume that the sequence $D = (D_n)_{n \le N}$ is a martingale with respect to $\widetilde{P}$. 
Then (14) holds for each $n \ge 0$. In fact, since the predictable variables $B_n$ are 
positive, we also have the converse result. 

The condition that $D = (D_n)_{n \le N}$ be a martingale means that 

$$E_{\widetilde{P}}(D_N \mid \mathcal{F}_n) = D_n. \tag{15}$$

However, $D_n = \delta_0 + \cdots + \delta_n = \Phi_n(N)$ and $D_N = \Phi_N(N) = S_N$. Hence if the 
futures prices are 

$$\Phi_n(N) = E_{\widetilde{P}}(S_N \mid \mathcal{F}_n), \quad n \le N, \tag{16}$$

then the corresponding futures contracts are arbitrage-free by (15). 


Remark. Let $B = (B_n)_{n \le N}$ be a deterministic sequence. Then, clearly, 

$$\Phi_n(N) = E_{\widetilde{P}}\Big( \frac{S_N}{B_N} \,\Big|\, \mathcal{F}_n \Big) \cdot B_N = \frac{S_n}{B_n}\, B_N,$$

and comparing with (11) we obtain the well-known result that for a deterministic 
bank account $B = (B_n)$ the forward and the futures prices are the same. 
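The coincidence of the two prices for a deterministic bank account can be illustrated numerically. The following is a small sketch under assumed data: a three-period binomial (CRR-type) model with hypothetical parameters `u`, `d`, `r`, in which the risk-neutral mean in (16) is computed by brute-force enumeration of paths and compared with formula (11) for $n = 0$.

```python
from fractions import Fraction
from itertools import product

S0, N = Fraction(100), 3
u, d, r = Fraction(2), Fraction(1, 2), Fraction(1, 4)   # assumed model parameters
q = (1 + r - d) / (u - d)                               # risk-neutral up-probability

def risk_neutral_mean(payoff):
    """E~ payoff(S_N), computed by enumerating all up/down paths."""
    total = Fraction(0)
    for path in product((True, False), repeat=N):
        s, prob = S0, Fraction(1)
        for up in path:
            s *= u if up else d
            prob *= q if up else 1 - q
        total += prob * payoff(s)
    return total

B_N = (1 + r) ** N                                # deterministic bank account, B_0 = 1
futures_price = risk_neutral_mean(lambda s: s)    # formula (16) with n = 0
forward_price = S0 / (1 / B_N)                    # formula (11) with n = 0, E~(1/B_N) = 1/B_N
```

Both quantities equal $S_0 B_N$ here. With a random bank account the two formulas would in general give different numbers, which is why the Remark singles out the deterministic case.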
§2a. Optimal Stopping Problems. Supermartingale Characterization 


1. The supermartingale characterization in § 1c of the sequence $Y = (Y_n)$ with 
respect to each measure in the family $\mathcal{P}(P)$ is not that striking if one treats the 
operation of taking the essential supremum in (12), § 1c as an optimization problem 
of finding the ‘best’ probability measure. For this interpretation, the above super- 
martingale property is a mere consequence of the well-known ‘optimality principle’ 
for the price process (the ‘Bellman function’) in a stochastic optimization problem. 

A special case of such a problem is the optimal stopping problem for a stochastic 
sequence f = (fn)n<n, which seems a reasonable starting point in our discussion 
of ‘supermartingale characterizations’ in optimization problems. The emphasis on 
this case is also justified by our discussion of American options below (the buyer of 
such an option has a right to choose the date of exercising, so that the latter can 
be regarded as an ‘optimization element’), and also because the necessity to have 
a ‘sufficiently rich’ class of objects when considering esssup is clearly visible there. 


2. Let $f = (f_n, \mathcal{F}_n)_{0 \le n \le N}$ be a stochastic sequence on $(\Omega, \mathcal{F}, (\mathcal{F}_n)_{0 \le n \le N}, P)$, 
where $\mathcal{F}_0 = \{\varnothing, \Omega\}$ and $\mathcal{F}_N = \mathcal{F}$. We shall assume that $E|f_n| < \infty$ for all 
$0 \le n \le N < \infty$. 
We are interested in the problems of finding 
1) the functions (prices) 

$$V_n^N = \sup_{\tau \in \mathfrak{M}_n^N} E f_\tau, \tag{1}$$

where the supremum is taken over the class $\mathfrak{M}_n^N$ of all stopping times $\tau$ such that 
$n \le \tau \le N$, and 
2) the optimal stopping time (which is well defined in our case). 

We do not formulate here the optimal stopping problem in its most general form 
(see subsection 4 below), when $N$ can be infinite (and $\mathfrak{M}_n^\infty$ is the class of all finite 
stopping times $\tau \ge n$); we restrict ourselves to finite 'time horizons'. The main 
reason is that this case is relatively easy to study, but on the other hand one must 
already use backward induction, which is one of the main tools in the search of 
both the prices $V_n^N$ and the corresponding optimal stopping times. 


3. We introduce a sequence $\gamma^N = (\gamma_n^N)_{0 \le n \le N}$ by the following (ad hoc) definition: 

$$\gamma_N^N = f_N, \qquad \gamma_n^N = \max\big(f_n, E(\gamma_{n+1}^N \mid \mathcal{F}_n)\big). \tag{2}$$

We also set 

$$\tau_n^N = \min\{n \le i \le N : f_i = \gamma_i^N\}$$

for $0 \le n \le N$. 
The following is one of the central results of the theory of optimal stopping problems 
for a finite time interval $0 \le n \le N$ (cf. [75; Chapter 3] and [441; Chapter 2]). 


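THEOREM 1. The sequence $\gamma^N = (\gamma_n^N)_{n \le N}$ defined by recursive relations (2) and 
the stopping times $\tau_n^N$, $0 \le n \le N$, have the following properties: 

(a) $\tau_n^N \in \mathfrak{M}_n^N$; 

(b) $E(f_{\tau_n^N} \mid \mathcal{F}_n) = \gamma_n^N$; 

(c) $E(f_\tau \mid \mathcal{F}_n) \le E(f_{\tau_n^N} \mid \mathcal{F}_n) = \gamma_n^N$ for each $\tau \in \mathfrak{M}_n^N$; 

(d) $\gamma_n^N = \operatorname*{ess\,sup}_{\tau \in \mathfrak{M}_n^N} E(f_\tau \mid \mathcal{F}_n)$ and, in particular, $\gamma_0^N = \sup_{\tau \in \mathfrak{M}_0^N} E f_\tau = E f_{\tau_0^N}$; 

(e) $V_n^N = E \gamma_n^N$. 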
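The recursion (2) and property (b) for $n = 0$ can be tried out on a small example. The sketch below rests on assumptions that are not in the text: the underlying process is a symmetric random walk started at 0, the reward sequence is a discounted positive part of the walk, and the function names are hypothetical. The first function runs the backward induction; the second evaluates $E f_{\tau_0^N}$ by enumerating all paths.

```python
from fractions import Fraction
from itertools import product

BETA = Fraction(9, 10)   # assumed discount factor

def f(n, x):
    """Assumed reward at time n when the walk is at x: beta^n * max(x, 0)."""
    return BETA ** n * max(x, 0)

def snell_envelope(N):
    """Backward induction (2): gamma_N = f_N, gamma_n = max(f_n, E(gamma_{n+1} | F_n))."""
    gamma = {(N, x): f(N, x) for x in range(-N, N + 1, 2)}
    for n in range(N - 1, -1, -1):
        for x in range(-n, n + 1, 2):   # states reachable at time n
            cont = (gamma[(n + 1, x + 1)] + gamma[(n + 1, x - 1)]) / 2
            gamma[(n, x)] = max(f(n, x), cont)
    return gamma

def value_of_first_hitting_rule(gamma, N):
    """E f_tau for tau = min{i : f_i = gamma_i}, by enumeration of all 2^N paths."""
    total = Fraction(0)
    for steps in product((-1, 1), repeat=N):
        n, x = 0, 0
        while f(n, x) != gamma[(n, x)]:   # always stops at n = N at the latest
            x += steps[n]
            n += 1
        total += f(n, x)
    return total / 2 ** N
```

For any horizon `N`, `value_of_first_hitting_rule(snell_envelope(N), N)` should coincide with the entry `(0, 0)` of the envelope, in line with (b); exact rational arithmetic is used so that the comparison `f == gamma` in the stopping rule is not disturbed by rounding.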


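Proof. We drop for simplicity the superscript $N$ in this proof and throughout 
subsection 3 and write $\gamma_n$, $V_n$, $\mathfrak{M}_n$, and $\tau_n$ in place of $\gamma_n^N$, $V_n^N$, $\mathfrak{M}_n^N$, and $\tau_n^N$. 
Property (a) is a consequence of the definition of $\tau_n$. Properties (b) and (c) are obvious 
for $n = N$. Now, we shall proceed by induction. 

Assume that we have already established these properties for $n = N, N-1, \ldots, k$. 
We claim that they also hold for $n = k - 1$. 

Let $\tau \in \mathfrak{M}_{k-1}$ and let $A \in \mathcal{F}_{k-1}$. We set $\bar{\tau} = \max(\tau, k)$. Clearly, $\bar{\tau} \in \mathfrak{M}_k$ and 
for $A \in \mathcal{F}_{k-1}$, bearing in mind that $\{\tau \ge k\} \in \mathcal{F}_{k-1}$, we obtain 

$$\begin{aligned}
E[I_A f_\tau] &= E[I_{A \cap \{\tau = k-1\}} f_\tau] + E[I_{A \cap \{\tau \ge k\}} f_\tau] \\
&= E[I_{A \cap \{\tau = k-1\}} f_{k-1}] + E[I_{A \cap \{\tau \ge k\}} E(f_{\bar{\tau}} \mid \mathcal{F}_{k-1})] \\
&= E[I_{A \cap \{\tau = k-1\}} f_{k-1}] + E[I_{A \cap \{\tau \ge k\}} E(E(f_{\bar{\tau}} \mid \mathcal{F}_k) \mid \mathcal{F}_{k-1})] \\
&\le E[I_{A \cap \{\tau = k-1\}} f_{k-1}] + E[I_{A \cap \{\tau \ge k\}} E(\gamma_k \mid \mathcal{F}_{k-1})] \\
&\le E[I_A \gamma_{k-1}],
\end{aligned} \tag{3}$$

where the last inequality is a consequence of (2). 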
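Thus, $E(f_\tau \mid \mathcal{F}_{k-1}) \le \gamma_{k-1}$. We shall have proved required assertions (b) and (c) 
for $n = k - 1$ once we have shown that 

$$E(f_{\tau_{k-1}} \mid \mathcal{F}_{k-1}) = \gamma_{k-1}. \tag{4}$$

To this end we consider the chain of inequalities in (3); we claim that we actually 
have the equality signs everywhere in (3) for $\tau = \tau_{k-1}$. 

Indeed, by the definition of $\tau_{k-1}$ we have $\bar{\tau} = \tau_k$ in the set $\{\tau_{k-1} \ge k\}$. Since 
$E(f_{\tau_k} \mid \mathcal{F}_k) = \gamma_k$ by the inductive hypothesis, it follows that in (3) we have 

$$\begin{aligned}
E[I_A f_{\tau_{k-1}}] &= E[I_{A \cap \{\tau_{k-1} = k-1\}} f_{k-1}] + E[I_{A \cap \{\tau_{k-1} \ge k\}} E(E(f_{\tau_k} \mid \mathcal{F}_k) \mid \mathcal{F}_{k-1})] \\
&= E[I_{A \cap \{\tau_{k-1} = k-1\}} f_{k-1}] + E[I_{A \cap \{\tau_{k-1} \ge k\}} E(\gamma_k \mid \mathcal{F}_{k-1})] \\
&= E[I_A \gamma_{k-1}],
\end{aligned}$$

where the last equality follows from the definition of $\gamma_{k-1}$ as $\max(f_{k-1}, E(\gamma_k \mid \mathcal{F}_{k-1}))$, 
which means that $\gamma_{k-1} = f_{k-1}$ on $\{\tau_{k-1} = k - 1\}$, while the inequality $f_{k-1} < \gamma_{k-1}$ 
in the set $\{\tau_{k-1} > k - 1\}$ means that $\gamma_{k-1} = E(\gamma_k \mid \mathcal{F}_{k-1})$ there. 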


Thus, we have proved (b) and (c), and therefore also assertion (d). Finally, since 
for each $\tau \in \mathfrak{M}_k$ we have 

$$E f_\tau \le E f_{\tau_k} = E \gamma_k$$

by (c), it follows that $V_k = \sup_{\tau \in \mathfrak{M}_k} E f_\tau = E f_{\tau_k} = E \gamma_k$, i.e., assertion (e) holds. 


Remark 1. We used in the above proof the fact that if $\tau \in \mathfrak{M}_{k-1}$, then the stopping 
time $\bar{\tau} = \max(\tau, k)$ belongs to $\mathfrak{M}_k$. In our case the class $\mathfrak{M}_k$ has this property 
simply by definition. This means that the $\mathfrak{M}_k$, $k \le N$, are 'sufficiently rich' classes 
in a certain sense. (See the end of subsection 1 in this connection.) 


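COROLLARY 1. The sequence $\gamma = (\gamma_n)_{n \le N}$ is a supermartingale. Moreover, $\gamma$ is 
the smallest supermartingale majorizing the sequence $f = (f_n)_{n \le N}$ in the following 
sense: if $\widetilde{\gamma} = (\widetilde{\gamma}_n)_{n \le N}$ is also a supermartingale and $\widetilde{\gamma}_n \ge f_n$ for all $n \le N$, then 
$\gamma_n \le \widetilde{\gamma}_n$ (P-a.s.), $n \le N$. 


Indeed, the fact that $\gamma = (\gamma_n)_{n \le N}$ is a supermartingale majorizing $f = (f_n)_{n \le N}$ 
follows from recursive relations (2). 
Further, it is clear that $\widetilde{\gamma}_N \ge f_N$ and, for $n < N$, 

$$\widetilde{\gamma}_n \ge \max\big(f_n, E(\widetilde{\gamma}_{n+1} \mid \mathcal{F}_n)\big). \tag{5}$$

Since $\gamma_N = f_N$, it follows that $\widetilde{\gamma}_N \ge \gamma_N$ and 

$$\widetilde{\gamma}_{N-1} \ge \max\big(f_{N-1}, E(\widetilde{\gamma}_N \mid \mathcal{F}_{N-1})\big) \ge \max\big(f_{N-1}, E(\gamma_N \mid \mathcal{F}_{N-1})\big) = \gamma_{N-1}.$$

In a similar way we can show that $\widetilde{\gamma}_n \ge \gamma_n$ for each $n < N - 1$. 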
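COROLLARY 2. Corollary 1 can be formulated otherwise: if $\gamma = (\gamma_n)_{n \le N}$ is a 
solution of the recursive system of equations 

$$\gamma_n = \max\big(f_n, E(\gamma_{n+1} \mid \mathcal{F}_n)\big), \quad n < N, \tag{6}$$

with $\gamma_N = f_N$, then $\gamma_n \le \widetilde{\gamma}_n$ for $n \le N$ and each sequence $\widetilde{\gamma} = (\widetilde{\gamma}_n)_{n \le N}$ such that 
$\widetilde{\gamma}_N \ge f_N$ and 

$$\widetilde{\gamma}_n \ge \max\big(f_n, E(\widetilde{\gamma}_{n+1} \mid \mathcal{F}_n)\big), \quad n < N. \tag{7}$$

We claim that among all such solutions $\widetilde{\gamma} = (\widetilde{\gamma}_n)_{n \le N}$ we can find the smallest 
solution $\gamma = (\gamma_n)_{n \le N}$ with $\gamma_N = f_N$ satisfying the system of equations (6). 
We set $\widetilde{\gamma}_N = f_N$, and let 

$$\widetilde{\gamma}_n = \max\big(f_n, E(\widetilde{\gamma}_{n+1} \mid \mathcal{F}_n)\big) \tag{8}$$

for $n < N$. Clearly, $\widetilde{\gamma}_n \ge f_n$ for all $n \le N$ and $\widetilde{\gamma} = (\widetilde{\gamma}_n)_{n \le N}$ is a supermartingale. 
Since $\gamma = (\gamma_n)_{n \le N}$ has by assumption the property of minimality, it follows that 

$$\max\big(f_n, E(\widetilde{\gamma}_{n+1} \mid \mathcal{F}_n)\big) = \widetilde{\gamma}_n \ge \gamma_n. \tag{9}$$

Hence 

$$\max\big(f_n, E(\widetilde{\gamma}_{n+1} \mid \mathcal{F}_n)\big) = \widetilde{\gamma}_n \ge \gamma_n \ge \max\big(f_n, E(\gamma_{n+1} \mid \mathcal{F}_n)\big)$$

for $n < N$ and, for $n = N$, 

$$f_N = \widetilde{\gamma}_N \ge \gamma_N \ge f_N.$$

Consequently, $\gamma_N = \widetilde{\gamma}_N = f_N$ and, in view of (9), for all $n < N$ we have 

$$\gamma_n = \widetilde{\gamma}_n.$$

Thus, the smallest supermartingale $\gamma = (\gamma_n)_{n \le N}$ majorizing the sequence $f = (f_n)_{n \le N}$ 
satisfies the equations (6) with $\gamma_N = f_N$. 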


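COROLLARY 3. The variable 

$$\tau_0^N = \min\{0 \le i \le N : f_i = \gamma_i^N\}$$

is an optimal stopping time in the class $\mathfrak{M}_0^N$, i.e., 

$$\sup_{\tau \in \mathfrak{M}_0^N} E f_\tau = E f_{\tau_0^N} \quad (= \gamma_0^N).$$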
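4. We now consider the possible generalizations of Theorem 1 to the case when 
$N = \infty$. More precisely, we shall assume that $\mathfrak{M}_n^* = \mathfrak{M}_n^\infty$ is the class of all finite 
Markov times $\tau = \tau(\omega)$ such that $\tau(\omega) \ge n$, $\omega \in \Omega$. We denote the class $\mathfrak{M}_0^*$ 
by $\mathfrak{M}^*$. 

Further, let $f = (f_n, \mathcal{F}_n)_{n \ge 0}$ be a stochastic sequence defined on 
$(\Omega, \mathcal{F}, (\mathcal{F}_n)_{n \ge 0}, P)$, and let 

$$V_n^* = \sup_{\tau \in \mathfrak{M}_n^*} E f_\tau, \tag{10}$$

$$\gamma_n^* = \operatorname*{ess\,sup}_{\tau \in \mathfrak{M}_n^*} E(f_\tau \mid \mathcal{F}_n), \tag{11}$$

$$\tau_n^* = \inf\{k \ge n : f_k = \gamma_k^*\}. \tag{12}$$

Of course one would expect that $\gamma_n^* = \lim_{N \to \infty} \gamma_n^N$ (under certain conditions) and 
that one can make the limit transition in (2), which yields the following equations 
for $\gamma^* = (\gamma_n^*)$: 

$$\gamma_n^* = \max\big(f_n, E(\gamma_{n+1}^* \mid \mathcal{F}_n)\big). \tag{13}$$

In a similar way it would be natural if $\tau_n^*$, defined in (12), were an optimal time in 
the class $\mathfrak{M}_n^*$ in the following sense: 

$$V_n^* = \sup_{\tau \in \mathfrak{M}_n^*} E f_\tau = E f_{\tau_n^*} \tag{14}$$

and 

$$V_n^* = E \gamma_n^*. \tag{15}$$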

As shown in the general theory of optimal stopping times (exposed, e.g., in [75] 
and [441]), these results hold, indeed, under certain conditions (but not always!). 

Referring to the above-mentioned monographs [75] and [441] for details we 
present just one, fairly general, result in this direction. 


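THEOREM 2. Let $f = (f_n, \mathcal{F}_n)_{n \ge 0}$ be a stochastic sequence with $E \sup_n f_n^- < \infty$. 
Then we have the following results. 

a) The sequence $\gamma^* = (\gamma_n^*)_{n \ge 0}$, where 

$$\gamma_n^* = \operatorname*{ess\,sup}_{\tau \in \mathfrak{M}_n^*} E(f_\tau \mid \mathcal{F}_n), \tag{16}$$

satisfies the recursive relations 

$$\gamma_n^* = \max\big(f_n, E(\gamma_{n+1}^* \mid \mathcal{F}_n)\big) \tag{17}$$

and is therefore a supermartingale majorizing the sequence $f = (f_n)$. 

b) The sequence $\gamma^* = (\gamma_n^*)_{n \ge 0}$ is the smallest supermartingale majorizing $f = (f_n)_{n \ge 0}$. 

c) Let $\tau_n^* = \inf\{k \ge n : f_k = \gamma_k^*\}$. If $E \sup_n |f_n| < \infty$ and $P(\tau_n^* < \infty) = 1$, then 
$\tau_n^*$ is an optimal stopping time: 

$$V_n^* = \sup_{\tau \in \mathfrak{M}_n^*} E f_\tau = E f_{\tau_n^*}, \tag{18}$$

$$\gamma_n^* = \operatorname*{ess\,sup}_{\tau \in \mathfrak{M}_n^*} E(f_\tau \mid \mathcal{F}_n) = E(f_{\tau_n^*} \mid \mathcal{F}_n). \tag{19}$$

d) For each $n \ge 0$, 

$$\gamma_n^N \uparrow \gamma_n^* \tag{20}$$

(P-a.s.) as $N \to \infty$. 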


5. We have already mentioned that for a finite time horizon ($N < \infty$) the optimal 
stopping problem can be solved by backward induction, by evaluating the variables 
$\gamma_N^N, \gamma_{N-1}^N, \ldots, \gamma_0^N$ recursively. This is possible since $\gamma_N^N = f_N$ and the $\gamma_n^N$ satisfy 
recursive relations (13). 

On the other hand, if the time horizon is infinite ($N = \infty$), then the problem 
of finding the sequence $\gamma^* = (\gamma_n^*)_{n \ge 0}$ becomes more delicate: in place of an explicit 
formula describing the situation at time $N$ we must use additional characteristics 
and properties of the prices $V_n^*$, $n \ge 0$. For example, one can sometimes seek 
the required solution of the system (17) using the observation that it must be the 
smallest of all solutions. 

The techniques of solution of optimal stopping problems are better developed 
in the Markovian case. 

For a description we assume that we have a homogeneous Markov process 
$X = (x_n, \mathcal{F}_n, P_x)$ with discrete time $n = 0, 1, \ldots$, with phase space $(E, \mathcal{B})$, and 
with probability measures $P_x$ on $\mathcal{F} = \bigvee \mathcal{F}_n$ corresponding to each initial state 
$x \in E$ (see [126] and [441] for greater detail). 

Let $T$ be the one-step transition operator (i.e., $Tf(x) = E_x f(x_1)$ for a measurable 
function $f = f(x)$ such that $E_x|f(x_1)| < \infty$ for $x \in E$, where $E_x$ is averaging 
with respect to the measure $P_x$, $x \in E$). 

Also, let $g = g(x)$ be some $\mathcal{B}$-measurable function of $x \in E$. 

We set $f_n$ (see the above discussion) to be equal to $g(x_n)$, $n \ge 0$, and assume 
that $E_x[\sup_n g^-(x_n)] < \infty$, $x \in E$. We also set 

$$s(x) = \sup_{\tau \in \mathfrak{M}^*} E_x g(x_\tau), \tag{21}$$

where $\mathfrak{M}^* = \{\tau : \tau(\omega) < \infty, \ \omega \in \Omega\}$ is the class of all finite Markov times. 

The optimal stopping problem for the Markov process $X$ consists in finding the 
function $s(x)$ and optimal stopping times $\tau^*$ (satisfying the equality $s(x) = E_x g(x_{\tau^*})$, 
$x \in E$) or $\varepsilon$-optimal times $\tau_\varepsilon^*$ (satisfying the relation $s(x) - \varepsilon \le E_x g(x_{\tau_\varepsilon^*})$, $x \in E$), 
provided that such times exist. 
The advantages of the Markovian case are obvious: the above-considered conditional 
expectations $E_x(\,\cdot \mid \mathcal{F}_n)$ depend on the 'history' $\mathcal{F}_n$ only through the value 
of the process at time $n$, i.e., through $x_n$. In particular, $\gamma_n^*$ and $\gamma_n^N$ are functions 
of $x_n$ alone. 

We present now a Markovian version of Theorems 1 and 2, which we shall use 
in what follows (in Chapter VI), e.g., in our analysis of American options. 


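THEOREM 3. Let $g = g(x)$ be a $\mathcal{B}$-measurable function with $E_x g^-(x_k) < \infty$, 
$x \in E$, $k \le N$, and let 

$$s_N(x) = \sup_{\tau \in \mathfrak{M}_0^N} E_x g(x_\tau), \tag{22}$$

where $\mathfrak{M}_0^N = \{\tau : 0 \le \tau \le N\}$, $N \ge 0$. 

Let 

$$Qg(x) = \max(g(x), Tg(x)) \tag{23}$$

and let 

$$\tau_0^N = \min\{0 \le m \le N : s_{N-m}(x_m) = g(x_m)\}. \tag{24}$$

Then 

(a) 
$$s_N(x) = Q^N g(x); \tag{25}$$

(b) 
$$s_N(x) = \max(g(x), T s_{N-1}(x)), \tag{26}$$

where $s_0(x) = g(x)$; 

(c) the Markov time $\tau_0^N$ is optimal in the class $\mathfrak{M}_0^N$: 

$$E_x g(x_{\tau_0^N}) = s_N(x), \quad x \in E; \tag{27}$$

(d) the sequence $\gamma^N = (\gamma_m^N, \mathcal{F}_m)_{m \le N}$ with $\gamma_m^N = s_{N-m}(x_m)$ is a supermartingale 
for each $N \ge 0$. 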


Proof. This result and its generalizations can be found in Chapter 2 of the monograph 
[441] devoted to the Markovian approach to optimal stopping problems. Of 
course, Theorem 3 is also a consequence of Theorem 1 above, with the only exception: 
one must in any case study the structure of the operators $Qg(x)$ and their 
iterations $Q^n g(x)$. (See § 2.2 in [441] for a fuller account.) 
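The relation between the iterates in (25) and the recursion (26) can be explored on a toy chain. The sketch below assumes, purely for illustration, that the Markov process is a symmetric random walk on the integers and that the reward is a 'tent' function; `T`, `Q`, `Q_power` and `s` are hypothetical helper names, and $Q^n$ is understood as the $n$-fold composition of the operator $h \mapsto \max(h, Th)$.

```python
from fractions import Fraction
from functools import lru_cache

def g(x):
    """Assumed reward: 3 - |x| inside (-3, 3) and zero outside."""
    return Fraction(max(3 - abs(x), 0))

def T(h):
    """One-step transition operator of the symmetric walk: Th(x) = E_x h(x_1)."""
    return lambda x: (h(x + 1) + h(x - 1)) / 2

def Q(h):
    """Operator (23): Qh(x) = max(h(x), Th(x))."""
    return lambda x: max(h(x), T(h)(x))

def Q_power(n):
    """n-fold iterate Q^n g, as in (25)."""
    h = g
    for _ in range(n):
        h = Q(h)
    return h

@lru_cache(maxsize=None)
def s(n, x):
    """Recursion (26): s_0 = g, s_n = max(g, T s_{n-1})."""
    if n == 0:
        return g(x)
    return max(g(x), (s(n - 1, x + 1) + s(n - 1, x - 1)) / 2)
```

Evaluating `Q_power(n)(x)` and `s(n, x)` at the same points gives the same rational numbers in this example. The memoized recursion is the cheaper of the two, since the naive composition of closures re-evaluates lower iterates many times.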


It follows from this theorem that the optimal stopping times in the class 
$\mathfrak{M}_0^N = \{\tau : 0 \le \tau \le N\}$ have the following structure for fixed $N < \infty$. 
Let 

$$D_n^N = \{x : s_{N-n}(x) = g(x)\}, \quad 0 \le n \le N, \tag{28}$$

and let 

$$C_n^N = E \setminus D_n^N. \tag{29}$$

In accordance with Theorem 3 the optimal time $\tau_0^N$ can be expressed as follows: 

$$\tau_0^N = \min\{0 \le n \le N : x_n \in D_n^N\}. \tag{30}$$

In other words, $D_0^N, D_1^N, \ldots, D_N^N = E$ is a sequence of stopping domains while 
$C_0^N, C_1^N, \ldots, C_N^N = \varnothing$ is a sequence of domains of continued observations. 
Note that 

$$D_0^N \subseteq D_1^N \subseteq \cdots \subseteq D_N^N = E$$

and 

$$C_0^N \supseteq C_1^N \supseteq \cdots \supseteq C_N^N = \varnothing.$$

Hence no observations are carried out for $x_0 \in D_0^N$ and $\tau_0^N = 0$ in that case, whereas 
for $x_0 \in C_0^N$ one performs an observation and if $x_1 \in D_1^N$, then the observations 
are terminated. If $x_1 \in C_1^N$, then the next observation is performed, and so on. All 
observations stop at the terminal instant $N$ (i.e., $D_N^N = E$). 
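The nested structure of the stopping domains can be made visible in a toy example. As before, the symmetric random walk, the 'tent' reward and the finite window of states are assumptions introduced only for this sketch, and `stopping_domain` is a hypothetical helper implementing (28) on that window.

```python
from fractions import Fraction
from functools import lru_cache

def g(x):
    """Assumed reward function: 3 - |x| inside (-3, 3) and zero outside."""
    return Fraction(max(3 - abs(x), 0))

@lru_cache(maxsize=None)
def s(n, x):
    """Prices s_n(x) from recursion (26) for the symmetric random walk."""
    if n == 0:
        return g(x)
    return max(g(x), (s(n - 1, x + 1) + s(n - 1, x - 1)) / 2)

def stopping_domain(N, n, window):
    """D_n^N from (28), restricted to the states in `window`."""
    return {x for x in window if s(N - n, x) == g(x)}

N = 4
window = range(-8, 9)   # only a finite part of the state space is inspected
domains = [stopping_domain(N, n, window) for n in range(N + 1)]
```

Printing `domains` shows the sets growing with `n` until the last one fills the whole window, which mirrors the inclusions stated above; states near the peak of the reward belong to every stopping domain.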


Remark 2. It follows from Theorem 1 that the qualitative picture changes little if, in place of the price $s_N(x)$ defined by (21), we consider the prices
$$s_N(x) = \sup_{\tau \in \mathfrak{M}_0^N} E_x\Bigl[\beta^\tau g(x_\tau) - \sum_{k=0}^{\tau-1} \beta^k c(x_k)\Bigr] \tag{21'}$$
with discount and observation fees (here $0 < \beta \le 1$ and $c(x) \ge 0$ for $x \in E$; if $\tau = 0$, then the expression in $[\,\cdot\,]$ is set to be equal to $g(x)$).
Formula (25) holds good with
$$Qg(x) = \max\bigl(g(x), \beta T g(x) - c(x)\bigr), \tag{23'}$$
and recursive equation (26) takes the following form:
$$s_N(x) = \max\bigl(g(x), \beta T s_{N-1}(x) - c(x)\bigr), \tag{26'}$$
where $s_0(x) = g(x)$. See [441; Chapter 2] for greater detail.
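
As an illustration of the recursion (26') one can iterate it on a small finite chain. This is a sketch only: the finite state space, the transition matrix `T`, the reward `g`, and the helper name `discounted_values` are assumptions made for the example.

```python
from fractions import Fraction

def discounted_values(g, T, beta, c, N):
    """Iterate s_n = max(g, beta * T s_{n-1} - c) starting from s_0 = g, cf. (26')."""
    s = list(g)
    for _ in range(N):
        Ts = [sum(p * v for p, v in zip(row, s)) for row in T]
        s = [max(gx, beta * t - cx) for gx, t, cx in zip(g, Ts, c)]
    return s

half = Fraction(1, 2)
# Hypothetical chain: symmetric walk on {0, ..., 4}, absorbed at 0 and 4.
T = [
    [1, 0, 0, 0, 0],
    [half, 0, half, 0, 0],
    [0, half, 0, half, 0],
    [0, 0, half, 0, half],
    [0, 0, 0, 0, 1],
]
g = [Fraction(v) for v in (0, 1, 0, 3, 0)]
no_fee = [Fraction(0)] * 5
high_fee = [Fraction(5)] * 5

plain = discounted_values(g, T, Fraction(1), no_fee, 3)         # beta = 1, c = 0
discounted = discounted_values(g, T, Fraction(1, 2), no_fee, 3)  # beta = 1/2
costly = discounted_values(g, T, Fraction(1), high_fee, 3)       # large fee
print(plain, discounted, costly)
```

With $\beta = 1$ and $c \equiv 0$ the iteration reduces to (26); a smaller $\beta$ or a larger fee can only lower the value, and a prohibitive fee makes immediate stopping optimal everywhere.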


EXAMPLE 1. Let $\varepsilon = (\varepsilon_1, \varepsilon_2, \dots)$ be Bernoulli variables,
$$P(\varepsilon_i = 1) = P(\varepsilon_i = -1) = \tfrac12.$$
Let $x_n = x + (\varepsilon_1 + \dots + \varepsilon_n)$, where $x \in E = \{0, \pm 1, \dots\}$, and let
$$s_N(x) = \sup_{\tau \in \mathfrak{M}_0^N} E_x\, \beta^\tau x_\tau.$$
If $\beta = 1$, then $Qg(x) = g(x)$ for $g(x) = x$, $x \in E$, and we can take $\tau_0^N \equiv 0$ as an optimal stopping time.

On the other hand, if $0 < \beta < 1$, then for $g(x) = x$ we see that $Q^N g(x) = x$ for $x = 0, 1, 2, \dots$ and $Q^N g(x) = \beta^N x$ for $x = -1, -2, \dots$. Hence $s_N(x) = \max(x, \beta^N x)$ in this case. The optimal time is now
$$\tau_0^N = \min\{0 \le n \le N\colon x_n \in \{0, 1, 2, \dots\}\}.$$
If $x_n \in \{-1, -2, \dots\}$ for all $0 \le n \le N$, then we set $\tau_0^N$ to be equal to $N$.
Note that for $0 < \beta < 1$ we have
$$s_N(x) \uparrow x^+ = \max(0, x) \quad \text{as } N \to \infty.$$


6. We consider now the optimal stopping problem for infinite horizon ($N = \infty$).
We set
$$s(x) = \sup_{\tau \in \mathfrak{M}^*} E_x\, g(x_\tau), \tag{31}$$
where $\mathfrak{M}^* = \{\tau\colon 0 \le \tau(\omega) < \infty\}$ is the class of all finite Markov times.


To state the corresponding result on the structure of the price $s = s(x)$, $x \in E$, and the optimal (or $\varepsilon$-optimal) stopping times we recall the following definition.

DEFINITION (see, e.g., [441]). We call a function $f = f(x)$ such that $E_x|f(x_1)| < \infty$, $x \in E$, and
$$f(x) \ge T f(x) \tag{32}$$
an excessive function for the homogeneous Markov process $X = (x_n, \mathscr{F}_n, P_x)_{n \ge 0}$, $x \in E$.

If, in addition, $f(x) \ge g(x)$, then we call $f = f(x)$ an excessive majorant of the function $g = g(x)$.

Clearly, if $f = f(x)$ is an excessive majorant of $g = g(x)$, then
$$f(x) \ge \max\bigl(g(x), Tf(x)\bigr). \tag{33}$$

The following theorem reveals the role of excessive majorants in optimal stopping problems for homogeneous Markov processes $X = (x_n, \mathscr{F}_n, P_x)_{n \ge 0}$, $x \in E$.
THEOREM 4. Let $g = g(x)$ be a function such that $E_x\bigl[\sup_n g^-(x_n)\bigr] < \infty$, $x \in E$.
Then
(a) the price $s = s(x)$ is the smallest excessive majorant of $g = g(x)$;
(b) $s(x) = \lim_{n \to \infty} Q^n g(x)$ $\bigl(= \lim_{N \to \infty} s_N(x)\bigr)$ and $s(x)$ satisfies the equation
$$s(x) = \max\bigl(g(x), Ts(x)\bigr) \tag{34}$$
(cf. (33));
(c) if $E_x\bigl[\sup_n |g(x_n)|\bigr] < \infty$, $x \in E$, then the time
$$\tau_\varepsilon = \inf\{n \ge 0\colon s(x_n) \le g(x_n) + \varepsilon\} \tag{35}$$
is $\varepsilon$-optimal for each $\varepsilon > 0$, i.e.,
$$s(x) - \varepsilon \le E_x\, g(x_{\tau_\varepsilon}), \quad x \in E; \tag{36}$$
(d) let $E_x\bigl[\sup_n |g(x_n)|\bigr] < \infty$ and
$$\tau^* = \inf\{n \ge 0\colon s(x_n) = g(x_n)\},$$
i.e., $\tau^* = \tau_0$; if $P_x(\tau^* < \infty) = 1$, $x \in E$, then $\tau^*$ is an optimal time:
$$s(x) = E_x\, g(x_{\tau^*}), \quad x \in E; \tag{37}$$
(e) if $E$ is a finite set, then $\tau^*$ is an optimal time.
As regards the proof of this theorem and its applications, see [441; Chapter 2].
In subsection 5 we discuss its applications to pricing of American options.
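
Assertion (b) suggests a simple numerical scheme: iterate the operator $Q$ until the iterates stabilize. The sketch below does this in floating point for a hypothetical finite chain (a symmetric walk on $\{0, \dots, 4\}$ absorbed at the end points) and a hypothetical reward; for such a walk the excessive functions are the concave ones, so the limit can be compared with the smallest concave majorant of $g$.

```python
def apply_Q(h, T):
    """One application of Q: (Qh)(x) = max(h(x), (Th)(x)), cf. (23)."""
    Th = [sum(p * v for p, v in zip(row, h)) for row in T]
    return [max(a, b) for a, b in zip(h, Th)]

def smallest_excessive_majorant(g, T, tol=1e-13, max_iter=10_000):
    """Approximate s = lim Q^n g by iterating Q until the change is below tol."""
    h = list(g)
    for _ in range(max_iter):
        nxt = apply_Q(h, T)
        if max(abs(a - b) for a, b in zip(nxt, h)) < tol:
            return nxt
        h = nxt
    return h

# Hypothetical example: symmetric walk on {0, ..., 4}, absorbed at 0 and 4.
T = [
    [1.0, 0.0, 0.0, 0.0, 0.0],
    [0.5, 0.0, 0.5, 0.0, 0.0],
    [0.0, 0.5, 0.0, 0.5, 0.0],
    [0.0, 0.0, 0.5, 0.0, 0.5],
    [0.0, 0.0, 0.0, 0.0, 1.0],
]
g = [0.0, 0.0, 0.0, 4.0, 0.0]
s = smallest_excessive_majorant(g, T)
Ts = [sum(p * v for p, v in zip(row, s)) for row in T]
stop_set = [x for x in range(5) if abs(s[x] - g[x]) < 1e-9]  # where s = g
print(s, stop_set)
```

Here the limit is the chord through $(0, 0)$ and $(3, 4)$ on the states $0, \dots, 3$, and the set $\{s = g\}$ consists of the absorbing states together with the peak of $g$, in line with the description of $\tau^*$ in (d) and (e).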


Remark 3. By analogy with (21') we could consider also the optimal stopping problem in the case with discounting ($0 < \beta \le 1$) and observation fee $c(x) \ge 0$.
We set
$$s(x) = \sup E_x\Bigl[\beta^\tau g(x_\tau) - \sum_{k=0}^{\tau-1} \beta^k c(x_k)\Bigr], \tag{31'}$$
where we take the supremum over the class
$$\mathfrak{M}^*_{(\beta,c)} = \Bigl\{\tau \in \mathfrak{M}^*\colon E_x \sum_{k=0}^{\tau-1} \beta^k c(x_k) < \infty,\ x \in E\Bigr\},$$
and assume that $g(x) \ge 0$.
Under these assumptions the price $s(x)$ is the smallest $(\beta, c)$-excessive majorant of $g(x)$ (see [441; Chapter 2]), i.e., the smallest function $f(x)$ such that $f(x) \ge g(x)$ and
$$f(x) \ge \beta T f(x) - c(x). \tag{33'}$$
Moreover,
$$s(x) = \max\bigl(g(x), \beta T s(x) - c(x)\bigr) \tag{34'}$$
and
$$s(x) = \lim_{n \to \infty} Q^n_{(\beta,c)} g(x),$$
where
$$Q_{(\beta,c)} g(x) = \max\bigl(g(x), \beta T g(x) - c(x)\bigr).$$




EXAMPLE 2. Let $x_n = x + (\varepsilon_1 + \dots + \varepsilon_n)$, where $x \in E = \{0, \pm 1, \dots\}$ and $\varepsilon = (\varepsilon_n)$ is a Bernoulli sequence from Example 1. For $x \in E$ we set
$$s(x) = \sup E_x\bigl(|x_\tau| - c\tau\bigr), \tag{38}$$
where we consider the supremum over all stopping times $\tau$ such that $E_x \tau < \infty$. For such $\tau$ we have
$$E_x x_\tau^2 = x^2 + E_x \tau, \tag{39}$$
so that
$$E_x\bigl(|x_\tau| - c\tau\bigr) = cx^2 + E_x\bigl(|x_\tau| - c|x_\tau|^2\bigr). \tag{40}$$
Hence
$$s(x) = cx^2 + \sup E_x\, g(x_\tau),$$
where $g(x) = |x| - c|x|^2$ and the supremum is taken over $\tau$ such that $E_x \tau < \infty$.

Since $g(x)$ attains its maximum for $x = \pm \frac{1}{2c}$, it follows in the case when $\frac{1}{2c}$ is an integer that
$$s(x) \le cx^2 + \frac{1}{4c}. \tag{41}$$
We now set $\tau_c = \inf\bigl\{n\colon |x_n| = \frac{1}{2c}\bigr\}$. If $|x| \le \frac{1}{2c}$, then, of course, $|x_{\tau_c \wedge N}| \le \frac{1}{2c}$, and therefore
$$\Bigl(\frac{1}{2c}\Bigr)^2 \ge E_x x^2_{\tau_c \wedge N} = x^2 + E_x(\tau_c \wedge N).$$
Passing to the limit as $N \to \infty$ (and using the monotone convergence theorem) we see that $E_x \tau_c \le \bigl(\frac{1}{2c}\bigr)^2 < \infty$. Hence $\tau_c$ satisfies (39), which shows that if $|x| \le \frac{1}{2c}$, then we actually have $E_x \tau_c = \frac{1}{(2c)^2} - x^2$.
Since
$$E_x\bigl(|x_{\tau_c}| - c\tau_c\bigr) = \frac{1}{2c} - c\Bigl(\frac{1}{(2c)^2} - x^2\Bigr) = cx^2 + \frac{1}{4c}$$
for $|x| \le \frac{1}{2c}$, it follows by (41) that $\tau_c$ is an optimal stopping time (for $|x| \le \frac{1}{2c}$).
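
The two identities used at the end of Example 2 can be checked exactly for a concrete fee. The sketch below is an illustration only: it fixes the hypothetical value $c = 1/8$ (so that $1/(2c) = 4$), computes $E_x \tau_c$ by solving the standard first-step equations $h(x) = 1 + \tfrac12 h(x-1) + \tfrac12 h(x+1)$, $h(\pm a) = 0$, with a tridiagonal elimination, and compares the results with $\frac{1}{(2c)^2} - x^2$ and $cx^2 + \frac{1}{4c}$.

```python
from fractions import Fraction

def expected_exit_times(a):
    """E_x(tau) for the symmetric walk leaving (-a, a), for x = -a+1, ..., a-1.

    Solves -h(x-1)/2 + h(x) - h(x+1)/2 = 1 with h(-a) = h(a) = 0
    by the Thomas (tridiagonal) algorithm in exact arithmetic.
    """
    n = 2 * a - 1
    off = Fraction(-1, 2)             # common sub- and super-diagonal entry
    diag = [Fraction(1)] * n
    rhs = [Fraction(1)] * n
    for i in range(1, n):             # forward elimination
        w = off / diag[i - 1]
        diag[i] -= w * off
        rhs[i] -= w * rhs[i - 1]
    h = [Fraction(0)] * n
    h[-1] = rhs[-1] / diag[-1]
    for i in range(n - 2, -1, -1):    # back substitution
        h[i] = (rhs[i] - off * h[i + 1]) / diag[i]
    return {x: h[x + a - 1] for x in range(-a + 1, a)}

a = 4                                 # a = 1/(2c), assumed to be an integer
c = Fraction(1, 2 * a)                # hypothetical fee c = 1/8
exit_time = expected_exit_times(a)
# Value of the rule tau_c started from x: |x_tau| = a at the exit time.
value = {x: a - c * exit_time[x] for x in exit_time}
print({x: str(v) for x, v in value.items()})
```

For every interior starting point the computed value coincides with the upper bound in (41), which is the optimality argument of the example in numerical form.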


§ 2b. Complete and Incomplete Markets. 
Supermartingale Characterization of Hedging Prices 


1. We now return to the proof of formula (8) for hedge prices on incomplete markets exposed in § 1c.

As mentioned there, this proof is based on the following two facts: the supermartingale property of the sequence $Y = (Y_n)_{n \le N}$ with respect to each measure in the family $\mathscr{P}(P)$ and the optional decomposition for $Y = (Y_n)_{n \le N}$.
In the present section we consider the supermartingale property not only for the sequence $Y = (Y_n)_{n \le N}$ defined by formulas (12) in § 1c, but also for a more general sequence defined by formula (1) below, which makes it possible to study American hedging (see the remark in § 1a).

The optional decomposition for $Y = (Y_n)_{n \le N}$ is discussed in § 2d.

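2. Let $(\Omega, \mathscr{F}, (\mathscr{F}_n)_{n \le N}, P)$ be our underlying probability space and let $(B, S)$ be a market formed by a bank account $B = (B_n)_{n \le N}$ (for which we assume that $B_n \equiv 1$) and a $d$-dimensional stock $S = (S^1, \dots, S^d)$, $S^i = (S^i_n)_{n \le N}$. Let $\mathscr{F}_0 = \{\varnothing, \Omega\}$ and let $\mathscr{F}_N = \mathscr{F}$.

Let $\mathscr{P}(P)$ be a non-empty set of martingale measures equivalent to $P$ and let $f = (f_0, f_1, \dots, f_N)$ be a sequence of $\mathscr{F}_n$-measurable nonnegative functions $f_n$, $n \le N$, such that $E_{\tilde P} f_k < \infty$, $\tilde P \in \mathscr{P}(P)$, $0 \le k \le N$.
We set
$$Y_n = \operatorname*{ess\,sup}_{\tilde P \in \mathscr{P}(P),\ \tau \in \mathfrak{M}_n^N} E_{\tilde P}(f_\tau \mid \mathscr{F}_n). \tag{1}$$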


THEOREM. The sequence $Y = (Y_n)_{n \le N}$ is a supermartingale with respect to each measure in $\mathscr{P}(P)$.


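Proof. Basically, this can be carried out along the same lines as the proof (in the preceding section) of the fact that the sequence $\gamma = (\gamma_n)_{n \le N}$ is a supermartingale.

We can proceed as follows.

We choose some (‘basic’) measure in the set $\mathscr{P}(P)$. To avoid extra notation, let $P$ be this measure ($P$ is therefore assumed to be a martingale measure). We shall verify that $Y$ is a supermartingale with respect to $P$.

If $\tilde P \in \mathscr{P}(P)$, then we set
$$Z_n = \frac{d\tilde P_n}{dP_n}, \quad n \le N,$$
where $\tilde P_n$ and $P_n$ are the restrictions of $\tilde P$ and $P$ to $\mathscr{F}_n$. For $n = 0$ we set $Z_0 = 1$.
Let
$$\rho_n = \frac{Z_n}{Z_{n-1}}, \quad 1 \le n \le N.$$
Since $\tilde P \sim P$, it follows that
$$P(Z_{n-1} > 0) = \tilde P(Z_{n-1} > 0) = 1$$
for each $n \le N$.

Setting $m_n = \rho_n - 1$, $M_n = \sum_{k=1}^{n} m_k$, and $M_0 = 0$ we have
$$\Delta Z_n = Z_{n-1}\, \Delta M_n. \tag{3}$$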
From (3) we can see that
$$Z_n = \mathscr{E}(M)_n = \prod_{k=1}^{n} (1 + \Delta M_k) = \prod_{k=1}^{n} \rho_k, \tag{4}$$
where $\mathscr{E}(M)$ is the stochastic exponential (see Chapter II, § 1a).

It follows from the above that having chosen $P$ to be the ‘basic’ measure we can completely characterize $\tilde P$ and its restrictions $\tilde P_n$, $n \le N$, by one of the sequences $(Z_n)$, $(M_n)$, or $(\rho_n)$.
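
The passage between the three sequences $(Z_n)$, $(M_n)$, $(\rho_n)$ is purely algebraic and can be checked along a single path. The following sketch uses a hypothetical positive density path; the function name `density_to_exponential` is introduced only for this illustration.

```python
from fractions import Fraction
from math import prod

def density_to_exponential(Z):
    """From a positive path Z_0 = 1, Z_1, ..., Z_N build rho_n, M_n and E(M)_n."""
    rho = [Z[n] / Z[n - 1] for n in range(1, len(Z))]   # rho_n = Z_n / Z_{n-1}
    increments = [r - 1 for r in rho]                    # m_n = Delta M_n
    M = [Fraction(0)]
    for inc in increments:
        M.append(M[-1] + inc)
    # Stochastic exponential: E(M)_n = prod_{k <= n} (1 + Delta M_k), cf. (4)
    exponential = [prod((1 + inc for inc in increments[:n]), start=Fraction(1))
                   for n in range(len(Z))]
    return rho, M, exponential

# Hypothetical density path along one trajectory.
Z = [Fraction(1), Fraction(3, 2), Fraction(3, 4), Fraction(9, 8)]
rho, M, exponential = density_to_exponential(Z)
print(rho, M, exponential)
```

The product formula reproduces the path of $Z$, and the increments satisfy the difference equation (3).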

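By Bayes’s formula ((4) in Chapter V, § 3a), for each stopping time $\tau$ (with respect to $(\mathscr{F}_n)$) and $n \le N$ we have
$$
\begin{aligned}
E_{\tilde P}(f_\tau \mid \mathscr{F}_n) &= \frac{1}{Z_n} E_P(f_\tau Z_\tau \mid \mathscr{F}_n) \\
&= E_P(\rho_{n+1} \cdots \rho_\tau f_\tau \mid \mathscr{F}_n) \\
&= E_P(\bar\rho_1 \cdots \bar\rho_n \cdot \bar\rho_{n+1} \cdots \bar\rho_\tau f_\tau \mid \mathscr{F}_n) \\
&= E_P(f_\tau \bar Z_\tau \mid \mathscr{F}_n),
\end{aligned} \tag{5}
$$
where $\bar\rho_1 = \dots = \bar\rho_n = 1$, $\bar\rho_k = \rho_k$, $k > n$, and $\bar Z_k = \bar\rho_1 \cdots \bar\rho_k$.
It is useful to note that defining a measure $\bar P$ by the equality $d\bar P = \bar Z_N\, dP$ we obtain
$$
\bar P(A) = P(A), \quad A \in \mathscr{F}_k,\ k \le n; \qquad
\bar P(A \mid \mathscr{F}_n) = \tilde P(A \mid \mathscr{F}_n), \quad A \in \mathscr{F}_k,\ k > n. \tag{6}
$$
Clearly, $\bar P \sim P$.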
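In view of our notation we can rewrite the definition (1) as follows:
$$Y_n = \operatorname*{ess\,sup} E_P(f_\tau \bar Z_\tau \mid \mathscr{F}_n),$$
where ess sup is taken over the class $\mathfrak{M}_n^N$ of stopping times $\tau$ such that $n \le \tau \le N$, and over $P$-martingales in the class $\mathscr{Z}_n^N$ of positive martingales $\bar Z = (\bar Z_k)_{k \le N}$ such that $\bar Z_0 = \dots = \bar Z_n = 1$ or, equivalently, $\bar Z_0 = \bar\rho_1 = \dots = \bar\rho_n = 1$.
The sets with $k \le N$ obviously satisfy the relations
$$\mathfrak{M}_k^N \subseteq \mathfrak{M}_{k-1}^N \quad \text{and} \quad \mathscr{Z}_k^N \subseteq \mathscr{Z}_{k-1}^N,$$
which play an important role in the proof of the supermartingale property of the sequence $Y = (Y_n, \mathscr{F}_n)_{n \le N}$.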

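By the definition of the essential supremum (see, e.g., [75; Chapter 1]), there exist sequences of times $\tau^{(i)}$ and martingales $\bar Z^{(i)}$ belonging to the classes $\mathfrak{M}_k^N$ and $\mathscr{Z}_k^N$, respectively, such that
$$\operatorname*{ess\,sup}_{\tau \in \mathfrak{M}_k^N,\ \bar Z \in \mathscr{Z}_k^N} E_P(f_\tau \bar Z_\tau \mid \mathscr{F}_k) = \lim{\uparrow}\, E_P\bigl(f_{\tau^{(i)}} \bar Z^{(i)}_{\tau^{(i)}} \mid \mathscr{F}_k\bigr), \tag{7}$$
where $\lim{\uparrow}$ is the limit of an increasing sequence.

Hence, by the monotone convergence theorem,
$$
\begin{aligned}
E_P(Y_k \mid \mathscr{F}_{k-1})
&= E_P\Bigl(\operatorname*{ess\,sup}_{\tau \in \mathfrak{M}_k^N,\ \bar Z \in \mathscr{Z}_k^N} E_P(f_\tau \bar Z_\tau \mid \mathscr{F}_k) \Bigm| \mathscr{F}_{k-1}\Bigr) \\
&= E_P\Bigl(\lim{\uparrow}\, E_P\bigl(f_{\tau^{(i)}} \bar Z^{(i)}_{\tau^{(i)}} \mid \mathscr{F}_k\bigr) \Bigm| \mathscr{F}_{k-1}\Bigr) \\
&= \lim{\uparrow}\, E_P\bigl(f_{\tau^{(i)}} \bar Z^{(i)}_{\tau^{(i)}} \mid \mathscr{F}_{k-1}\bigr) \\
&\le \operatorname*{ess\,sup}_{\tau \in \mathfrak{M}_k^N,\ \bar Z \in \mathscr{Z}_k^N} E_P(f_\tau \bar Z_\tau \mid \mathscr{F}_{k-1}) \\
&\le \operatorname*{ess\,sup}_{\tau \in \mathfrak{M}_{k-1}^N,\ \bar Z \in \mathscr{Z}_{k-1}^N} E_P(f_\tau \bar Z_\tau \mid \mathscr{F}_{k-1}) = Y_{k-1},
\end{aligned}
$$
which is just the required supermartingale property.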


Remark. We can extend Theorem 1 to the control with stopping case, with $f_\tau$ replaced by a functional $\sum_{k=1}^{\tau} \alpha_k \Delta g_k + f_\tau$, where $g = (g_0, g_1, \dots, g_N)$ is a sequence of $\mathscr{F}_n$-measurable functions $g_n$ and $\alpha = (\alpha_1, \dots, \alpha_N)$ is a control belonging to a sufficiently rich class of predictable sequences. (See § 1a.2 in this connection.)


§2c. Complete and Incomplete Markets. 
Main Formulas for Hedging Prices 


1. By the main formula for the price $C^*(f_N; P)$ of European hedging on an arbitrage-free $(B, S)$-market (§ 1c),
$$C^*(f_N; P) = \sup_{\tilde P \in \mathscr{P}(P)} B_0\, E_{\tilde P} \frac{f_N}{B_N}. \tag{1}$$

We proceed now to a more complex financial instrument, American hedging on a $(B, S)$-market. We shall assume here that the set of martingale measures $\mathscr{P}(P)$ is non-empty.

We have mentioned on several occasions that we must often (e.g., in the study of American options) consider in place of a single pay-off function $f_N$ an entire system of functions $f = (f_n)_{n \le N}$ interpreted as follows. If the buyer exercises the option at time $n$, then the corresponding amount (payable by the writer) is described by the $\mathscr{F}_n$-measurable function $f_n = f_n(\omega)$.

Of course, the writer must choose only strategies $\pi$ of value $X^\pi = (X_n^\pi)_{n \le N}$ satisfying the hedging condition
$$X_\tau^\pi \ge f_\tau \quad (P\text{-a.s.}), \tag{2}$$
which guarantees his ability to meet the contract terms for each stopping time $\tau = \tau(\omega)$ that can be chosen by the buyer to exercise the contract.
2. For more precise formulations of the corresponding problems we introduce now several definitions.
Following § 2a we set
$$\mathfrak{M}_n^N = \{\tau = \tau(\omega)\colon n \le \tau(\omega) \le N,\ \omega \in \Omega\}.$$
If $\pi = (\beta, \gamma)$, with $\beta = (\beta_n)_{n \le N}$ and $\gamma = (\gamma_n)_{n \le N}$, is a securities portfolio of value
$$X_n^\pi = \beta_n B_n + \gamma_n S_n, \quad n \le N, \tag{3}$$
then we shall understand its self-financing property in the sense of the following balance condition:
$$\Delta X_n^\pi = \beta_n \Delta B_n + \gamma_n \Delta S_n - \Delta C_n, \tag{4}$$
where $C = (C_n)_{n \ge 0}$ is a non-negative process, $C_0 = 0$, and $C_n$ is $\mathscr{F}_n$-measurable. (Cf. the ‘consumption’ case in Chapter V, § 1a.)

To point out the dependence of the capital $X^\pi$ on the ‘consumption’ we shall denote it by $X^{\pi,C}$ (as in § 1c).
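DEFINITION 1. By the upper price of American hedging (against a system of $\mathscr{F}_n$-measurable payment functions $f_n$, $n \le N$) we mean the quantity
$$\overline{C}(f_N; P) = \inf\bigl\{x\colon \exists\, (\pi, C) \text{ with } X_0^{\pi,C} = x \text{ and } X_\tau^{\pi,C} \ge f_\tau \ (P\text{-a.s.})\ \forall\, \tau \in \mathfrak{M}_0^N\bigr\}. \tag{5}$$

DEFINITION 2. We say that a strategy $(\pi, C)$ is perfect if $X_n^{\pi,C} \ge f_n$ for each $n \le N$ and $X_N^{\pi,C} = f_N$ ($P$-a.s.).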

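THEOREM 1 (Main formula for American hedging price). Assume that $\mathscr{P}(P) \ne \varnothing$ and let $f = (f_n)_{n \le N}$ be a sequence of non-negative payment functions such that
$$\sup_{\tilde P \in \mathscr{P}(P)} E_{\tilde P} \frac{f_n}{B_n} < \infty, \quad n \le N. \tag{6}$$
Then the upper price of American hedging is
$$\overline{C}(f_N; P) = \sup_{\tilde P \in \mathscr{P}(P),\ \tau \in \mathfrak{M}_0^N} B_0\, E_{\tilde P} \frac{f_\tau}{B_\tau}. \tag{7}$$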


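Proof. We have already made the necessary preparatory work for the proof of this formula.
As in the case of European hedging we prove first that
$$\sup_{\tilde P \in \mathscr{P}(P),\ \tau \in \mathfrak{M}_0^N} B_0\, E_{\tilde P} \frac{f_\tau}{B_\tau} \le \overline{C}(f_N; P). \tag{8}$$
If the set of hedges is empty, then $\overline{C}(f_N; P) = \infty$ (by Definition 1), and (8) is obvious.

Now let $\pi$ be a self-financing hedge (with consumption $C$) such that $X_0^{\pi,C} = x$.

By analogy with (9) in § 1c, we see that
$$\frac{f_\tau}{B_\tau} \le \frac{X_\tau^{\pi,C}}{B_\tau} = \frac{x}{B_0} + \sum_{k=1}^{\tau} \gamma_k\, \Delta\Bigl(\frac{S_k}{B_k}\Bigr) - \sum_{k=1}^{\tau} \frac{\Delta C_k}{B_{k-1}} \tag{9}$$
for each $\tau \in \mathfrak{M}_0^N$. In particular,
$$\sum_{k=1}^{\tau} \gamma_k\, \Delta\Bigl(\frac{S_k}{B_k}\Bigr) \ge -\frac{x}{B_0}.$$
Hence we obtain by assertion 2) of the lemma in Chapter II, § 1c that the sequence
$$\Bigl(\sum_{k=1}^{n} \gamma_k\, \Delta\Bigl(\frac{S_k}{B_k}\Bigr)\Bigr)_{n \le N}$$
is a martingale with respect to each measure $\tilde P \in \mathscr{P}(P)$.
Applying Doob’s stopping theorem (see Chapter V, § 3a) we see from (9) that
$$\sup_{\tilde P \in \mathscr{P}(P),\ \tau \in \mathfrak{M}_0^N} E_{\tilde P} \frac{f_\tau}{B_\tau} \le \frac{x}{B_0}, \tag{10}$$
and therefore (8) holds.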
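The proof of the reverse inequality to (8) is more complicated. It is sufficient to find a portfolio $\tilde\pi = (\tilde\beta, \tilde\gamma)$ and ‘consumption’ $\tilde C$ such that the capital $X^{\tilde\pi,\tilde C}$ satisfies the ‘balance’ conditions
$$\Delta X_n^{\tilde\pi,\tilde C} = \tilde\beta_n \Delta B_n + \tilde\gamma_n \Delta S_n - \Delta \tilde C_n, \quad n \le N, \tag{11}$$
the starting capital is
$$X_0^{\tilde\pi,\tilde C} = \sup_{\tilde P \in \mathscr{P}(P),\ \tau \in \mathfrak{M}_0^N} B_0\, E_{\tilde P} \frac{f_\tau}{B_\tau}, \tag{12}$$
and ($P$-a.s.)
$$X_\tau^{\tilde\pi,\tilde C} \ge f_\tau, \quad \forall\, \tau \in \mathfrak{M}_0^N. \tag{13}$$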
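To this end we consider a sequence $\tilde Y = (\tilde Y_n)_{n \le N}$ with
$$\tilde Y_n = \operatorname*{ess\,sup}_{\tilde P \in \mathscr{P}(P),\ \tau \in \mathfrak{M}_n^N} E_{\tilde P}\Bigl(\frac{f_\tau}{B_\tau} \Bigm| \mathscr{F}_n\Bigr). \tag{14}$$
It follows by the theorem in § 2b that $\tilde Y = (\tilde Y_n)_{n \le N}$ is a supermartingale with respect to each measure $\tilde P \in \mathscr{P}(P)$. On the other hand it follows from § 2d that the supermartingale $\tilde Y = (\tilde Y_n)_{n \le N}$ has the optional decomposition (holding $\tilde P$-a.s. for each measure $\tilde P \in \mathscr{P}(P)$)
$$\tilde Y_n = \tilde Y_0 + \sum_{k=1}^{n} \tilde\gamma_k\, \Delta\Bigl(\frac{S_k}{B_k}\Bigr) - \tilde C_n \tag{15}$$
with some predictable sequence $\tilde\gamma = (\tilde\gamma_n)_{n \le N}$ and non-negative sequence $\tilde C = (\tilde C_n)_{n \le N}$ such that $\tilde C_0 = 0$ and the $\tilde C_n$ are $\mathscr{F}_n$-measurable.
Finding $\tilde\gamma$ and $\tilde C$ from the decomposition (15) we define $\tilde\beta = (\tilde\beta_n)_{n \le N}$ by setting
$$\tilde\beta_n = \tilde Y_n - \tilde\gamma_n \frac{S_n}{B_n}. \tag{16}$$
The value of the strategy $(\tilde\pi, \tilde C)$ is
$$X_n^{\tilde\pi,\tilde C} = \tilde\beta_n B_n + \tilde\gamma_n S_n, \tag{17}$$
and ‘balance’ condition (11) is satisfied in view of (15). Since $X_n^{\tilde\pi,\tilde C} = B_n \tilde Y_n$, the capital $X_n^{\tilde\pi,\tilde C}$ has the following representation by (14):
$$X_n^{\tilde\pi,\tilde C} = \operatorname*{ess\,sup}_{\tilde P \in \mathscr{P}(P),\ \tau \in \mathfrak{M}_n^N} B_n\, E_{\tilde P}\Bigl(\frac{f_\tau}{B_\tau} \Bigm| \mathscr{F}_n\Bigr). \tag{18}$$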


We conclude from this formula that
1) the initial capital of the strategy $(\tilde\pi, \tilde C)$ is defined by (12);
2) $(\tilde\pi, \tilde C)$ is a hedging strategy in the following sense: $X_n^{\tilde\pi,\tilde C} \ge f_n$ for $n \le N$, or, equivalently, (13) is satisfied;
3) $(\tilde\pi, \tilde C)$ has the following replication property: $X_N^{\tilde\pi,\tilde C} = f_N$ ($P$-a.s.).

The proof is complete and, on the way, we have established also the following result (cf. Theorem 2 in § 1c).
THEOREM 2 (Main formulas for hedging, consumption, and capital). Under the assumptions of Theorem 1 there exists a hedge $\tilde\pi = (\tilde\beta, \tilde\gamma)$ and consumption $\tilde C$ such that the dynamics of the corresponding capital $X_n^{\tilde\pi,\tilde C} = \tilde\beta_n B_n + \tilde\gamma_n S_n$ satisfies ‘balance’ conditions (11). Moreover, $X_0^{\tilde\pi,\tilde C}$ satisfies (12), the dynamics of $X_n^{\tilde\pi,\tilde C}$ is determined by (18), the components of $\tilde\gamma = (\tilde\gamma_n)$ and consumption $\tilde C = (\tilde C_n)$ can be found from the optional decomposition (15), and $\tilde\beta = (\tilde\beta_n)$ can be found from (16).
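
A one-period toy market shows how the two sides of formula (7) meet. The sketch below is illustrative only and rests on several assumptions that are not in the text: $N = 1$, $B \equiv 1$, a single stock with three possible values at time 1 (so the market is incomplete), and a put pay-off. The supremum over equivalent martingale measures is computed as the maximum over the vertices of the closure of the set of martingale measures (each vertex charges at most two states), and the cheapest superhedge is found at a kink of the piecewise-linear cost function.

```python
from fractions import Fraction

def sup_expectation(S0, S1, f1):
    """sup of E f1 over probability vectors p with E S1 = S0 (vertices of the closure)."""
    best = None
    for i in range(len(S1)):
        for j in range(len(S1)):
            if S1[i] == S1[j]:
                continue
            p_i = (S0 - S1[j]) / (S1[i] - S1[j])   # weight of state i, rest on state j
            if 0 <= p_i <= 1:
                value = p_i * f1[i] + (1 - p_i) * f1[j]
                best = value if best is None else max(best, value)
    return best

def cheapest_superhedge(S0, S1, f1):
    """min over gamma of the capital x with x + gamma * (S1 - S0) >= f1 in every state."""
    def cost(gamma):
        return max(f - gamma * (s - S0) for f, s in zip(f1, S1))
    kinks = [(f1[i] - f1[j]) / (S1[i] - S1[j])
             for i in range(len(S1)) for j in range(i + 1, len(S1))
             if S1[i] != S1[j]]
    gamma = min(kinks, key=cost)   # a convex piecewise-linear cost is minimal at a kink
    return cost(gamma), gamma

S0 = Fraction(1)
S1 = [Fraction(2), Fraction(1), Fraction(1, 2)]      # hypothetical up, middle, down values
f0 = Fraction(0)                                     # pay-off if exercised at time 0
f1 = [max(Fraction(1) - s, Fraction(0)) for s in S1] # put with strike 1 at time 1

dual_price = max(f0, sup_expectation(S0, S1, f1))    # right-hand side of (7) for N = 1
x_min, gamma = cheapest_superhedge(S0, S1, f1)
primal_price = max(f0, x_min)                        # smallest hedging capital
print(dual_price, primal_price, gamma)
```

Both computations give the same number, although the supremum is not attained by any equivalent martingale measure in this example (the maximizing vertex puts no mass on the middle state).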


4. In connection with the assumption ‘$\mathscr{P}(P) \ne \varnothing$’ in Theorems 1 and 2 of this section and the assumption of the absence of arbitrage in Theorems 1 and 2 of § 1b we can make the following observation (taking the hedging of option contracts as an example).

The standard definition of the absence of arbitrage (see Definition 2 in Chapter V, § 2a) relates to some particular instant $N$, e.g., the maturity date of a European option.

On the other hand, in dealing with American options one must consider in place of a single instant $N$ an entire class $\mathfrak{M}_0^N$ of stopping times $\tau$. For that reason, in place of the assumption ‘$\mathscr{P}(P) \ne \varnothing$’ in Theorems 1 and 2, it would be more logical to assume that the market is arbitrage-free in the strong sense (see Definition 3 in Chapter V, § 2a).

By the extended version of the First fundamental theorem (Chapter V, § 2e) the ‘arbitrage-free’ state (whether in the strong sense or not) is equivalent in our case of discrete time ($n \le N < \infty$) and finitely many stocks ($d < \infty$) to the condition $\mathscr{P}(P) \ne \varnothing$.

The reasons why we have actually put this condition in the form $\mathscr{P}(P) \ne \varnothing$ in the statement of the theorem are as follows. First, this property can often be verified (even in the continuous-time case); second, the term ‘arbitrage-free in the strong sense’ is not widely accepted and the question of the equivalence of different interpretations of the ‘absence of arbitrage’ and the condition $\mathscr{P}(P) \ne \varnothing$ is not always that simple, especially in the continuous-time case. (See Chapter VII, §§ 1a, c in this connection.)


5. As regards the writer of an American option, he must first of all choose a strategy $(\pi, C)$ enabling him to meet the terms of the contract. This imposes the following constraint on the capital $X^{\pi,C}$: the ‘hedging condition’ $X_\tau^{\pi,C} \ge f_\tau$ must be satisfied ($P$-a.s.) for each $\tau \in \mathfrak{M}_0^N$.

Now, there arises the natural question on the particular instant $\tau = \tau(\omega)$ at which it would be reasonable for the buyer to exercise the contract.

We shall consider the case of a complete arbitrage-free market, which in the present context of discrete time ($n \le N < \infty$) and finitely many stocks ($d < \infty$) is equivalent to the existence of a unique martingale measure $\tilde P$ (the Second fundamental theorem).

Before answering this question we reformulate and somewhat improve on Theorems 1 and 2 in the case under consideration.
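THEOREM 3. Let $\tilde P$ be a unique martingale measure ($|\mathscr{P}(P)| = 1$) and let $f = (f_n)_{n \le N}$ be a sequence of non-negative pay-off functions with $E_{\tilde P} \frac{f_n}{B_n} < \infty$, $n \le N$.
Then we have the following results.

1) The upper price is
$$\overline{C}(f_N; P) = \sup_{\tau \in \mathfrak{M}_0^N} B_0\, E_{\tilde P} \frac{f_\tau}{B_\tau}. \tag{19}$$

2) There exists a self-financing strategy $(\tilde\pi, \tilde C)$ such that the corresponding capital $X^{\tilde\pi,\tilde C}$ satisfies the ‘balance’ condition
$$\Delta X_n^{\tilde\pi,\tilde C} = \tilde\beta_n \Delta B_n + \tilde\gamma_n \Delta S_n - \Delta \tilde C_n, \tag{20}$$
$$X_0^{\tilde\pi,\tilde C} = \sup_{\tau \in \mathfrak{M}_0^N} B_0\, E_{\tilde P} \frac{f_\tau}{B_\tau} \quad \bigl(= \overline{C}(f_N; P)\bigr), \tag{21}$$
$$X_\tau^{\tilde\pi,\tilde C} \ge f_\tau, \quad \forall\, \tau \in \mathfrak{M}_0^N. \tag{22}$$
The dynamics of $X^{\tilde\pi,\tilde C}$ is described by the formulas
$$X_n^{\tilde\pi,\tilde C} = B_n \operatorname*{ess\,sup}_{\tau \in \mathfrak{M}_n^N} E_{\tilde P}\Bigl(\frac{f_\tau}{B_\tau} \Bigm| \mathscr{F}_n\Bigr). \tag{23}$$

3) The components $\tilde\gamma = (\tilde\gamma_n)$ and $\tilde C = (\tilde C_n)$ can be determined from the Doob decomposition for the supermartingale $\tilde Y = (\tilde Y_n, \mathscr{F}_n, \tilde P)_{n \le N}$ with
$$\tilde Y_n = \operatorname*{ess\,sup}_{\tau \in \mathfrak{M}_n^N} E_{\tilde P}\Bigl(\frac{f_\tau}{B_\tau} \Bigm| \mathscr{F}_n\Bigr),$$
which has the following form:
$$\tilde Y_n = \tilde Y_0 + \sum_{k=1}^{n} \tilde\gamma_k\, \Delta\Bigl(\frac{S_k}{B_k}\Bigr) - \tilde C_n. \tag{24}$$
The components $\tilde\beta = (\tilde\beta_n)_{n \le N}$ are defined by the relations
$$\tilde\beta_n = \tilde Y_n - \tilde\gamma_n \frac{S_n}{B_n}. \tag{25}$$

4) In the optimal stopping problem of finding
$$\sup_{\tau \in \mathfrak{M}_0^N} E_{\tilde P} \frac{f_\tau}{B_\tau}, \tag{26}$$
the stopping time
$$\tilde\tau = \min\Bigl\{0 \le n \le N\colon \tilde Y_n = \frac{f_n}{B_n}\Bigr\} \tag{27}$$
is optimal, i.e.,
$$\sup_{\tau \in \mathfrak{M}_0^N} E_{\tilde P} \frac{f_\tau}{B_\tau} = E_{\tilde P} \frac{f_{\tilde\tau}}{B_{\tilde\tau}}. \tag{28}$$
Moreover,
$$X_{\tilde\tau}^{\tilde\pi,\tilde C} = f_{\tilde\tau} \quad (P\text{-a.s.}), \tag{29}$$
and the sequence $(\tilde Y_n)_{n \le N}$ is the smallest $\tilde P$-supermartingale majorizing $\bigl(\frac{f_n}{B_n}\bigr)_{n \le N}$.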

Proof. Assertions 1)–3) follow from Theorems 1 and 2. It should be noted only that, since the martingale measure is unique, we need not refer to optional decompositions; it suffices to use directly the Doob decomposition
$$\tilde Y_n = \tilde M_n - \tilde C_n \tag{30}$$
of the supermartingale $\tilde Y = (\tilde Y_n, \mathscr{F}_n, \tilde P)_{n \le N}$ (see Chapter II, § 1b).

By the extended version of the Second fundamental theorem (Chapter V, § 4f) the martingale $\tilde M = (\tilde M_n, \mathscr{F}_n, \tilde P)$ can be represented as follows:
$$\tilde M_n = \tilde Y_0 + \sum_{k=1}^{n} \tilde\gamma_k\, \Delta\Bigl(\frac{S_k}{B_k}\Bigr). \tag{31}$$
Together with (30), this brings us to the required representation (24).
As regards assertion 4), this is a special case of Corollaries 1 and 3 to Theorem 1 in § 2a.
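
In a complete market the quantities of Theorem 3 can be computed by backward induction. The sketch below assumes a two-period binomial (Cox–Ross–Rubinstein type) model with hypothetical parameters and a put pay-off; none of these choices come from the text. It builds the discounted sequence $\tilde Y_n$ as the smallest supermartingale dominating $f_n / B_n$ and reads off the upper price (19).

```python
from fractions import Fraction

def snell_envelope(S0, u, d, r, K, N):
    """Y[n][j] = discounted Snell envelope of an American put at node (n, j).

    Node (n, j) means j up-moves in the first n steps; B_n = (1 + r)**n, B_0 = 1.
    """
    p = (1 + r - d) / (u - d)   # up-probability under the unique martingale measure

    def discounted_payoff(n, j):
        stock = S0 * u**j * d**(n - j)
        return max(K - stock, 0) / (1 + r)**n        # f_n / B_n

    Y = [[discounted_payoff(N, j) for j in range(N + 1)]]
    for n in range(N - 1, -1, -1):
        nxt = Y[0]
        row = [max(discounted_payoff(n, j), p * nxt[j + 1] + (1 - p) * nxt[j])
               for j in range(n + 1)]
        Y.insert(0, row)                              # keep Y[n] aligned with time n
    return p, Y, discounted_payoff

# Hypothetical parameters chosen so that all quantities are exact fractions.
S0, u, d, r, K, N = (Fraction(100), Fraction(2), Fraction(1, 2),
                     Fraction(1, 4), Fraction(100), 2)
p, Y, payoff = snell_envelope(S0, u, d, r, K, N)
american_price = Y[0][0]                              # B_0 * Y_0 with B_0 = 1
european_price = sum(
    p**j * (1 - p)**(N - j) * (1 if j in (0, N) else 2) * payoff(N, j)
    for j in range(N + 1)
)                                                     # binomial weights for N = 2
print(american_price, european_price)
```

The gap between the two prices comes from the node where the envelope equals the immediate pay-off before maturity; at that node the Doob compensator in (30) makes its only jump.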
6. We now proceed directly to the question whether (judging on the basis of the information contained in the flow $(\mathscr{F}_n)$) it would be ‘reasonable’ of the buyer to choose the exercise time $\tau$.
Both buyer and writer operate with the understanding that the option price $\overline{C}(f_N; P)$ defined by (19) is mutually acceptable. (See Chapter V, § 1b.4.)

We consider now all strategies $(\pi, C)$ with initial capital $X_0^{\pi,C} = \overline{C}(f_N; P)$ that bring about hedging, i.e., strategies such that $X_n^{\pi,C} \ge f_n$ for $n \le N$. We denote the class of these strategies by $\Pi(\overline{C}(f_N; P))$.
This class contains a strategy $(\tilde\pi, \tilde C)$ of the minimum value, such that
$$f_n \le X_n^{\tilde\pi,\tilde C} \le X_n^{\pi,C}, \quad n \le N, \tag{32}$$
for each $(\pi, C) \in \Pi(\overline{C}(f_N; P))$. For it follows from the ‘balance’ conditions that $Y_n = \frac{X_n^{\pi,C}}{B_n}$, $n \le N$, is a $\tilde P$-supermartingale majorizing $\frac{f_n}{B_n}$, $n \le N$, while it follows from assertion 4) in Theorem 3 that the sequence $\tilde Y_n$, $n \le N$, is the smallest $\tilde P$-supermartingale majorizing $\frac{f_n}{B_n}$, $n \le N$.

Hence $\tilde Y_n \le Y_n$ for $n \le N$. Together with the relation
$$\frac{f_n}{B_n} \le \frac{X_n^{\tilde\pi,\tilde C}}{B_n} = \tilde Y_n,$$
this proves inequalities (32).
These inequalities show that for each stopping time $\tau$ we have
$$f_\tau \le X_\tau^{\tilde\pi,\tilde C} \le X_\tau^{\pi,C}. \tag{33}$$
Clearly, the buyer must choose $\tau$ such that for no strategy $(\pi, C) \in \Pi(\overline{C}(f_N; P))$ the writer would get profits $X_\tau^{\pi,C} - f_\tau > 0$ with positive probability. In other words, the buyer may consider only the stopping times $\tau$ such that
$$f_\tau = X_\tau^{\pi,C} \quad (P\text{-a.s.}), \quad \forall\, (\pi, C) \in \Pi(\overline{C}(f_N; P)). \tag{34}$$
All this justifies the following definition.
DEFINITION. Stopping times $\tau$ satisfying (34) are called rational exercise times.


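THEOREM 4. Each stopping time $\tilde\tau$ that solves the optimal stopping problem (26) (i.e., each stopping time satisfying (28)) is a rational exercise time.

Proof. Let $(\pi, C) \in \Pi(\overline{C}(f_N; P))$. Then, in view of the $\tilde P$-supermartingale property of the sequence $Y_n = \frac{X_n^{\pi,C}}{B_n}$, $n \le N$, we see that
$$\overline{C}(f_N; P) = X_0^{\pi,C} \ge B_0\, E_{\tilde P} \frac{X_{\tilde\tau}^{\pi,C}}{B_{\tilde\tau}} \ge B_0\, E_{\tilde P} \frac{f_{\tilde\tau}}{B_{\tilde\tau}} = B_0 \sup_{\tau \in \mathfrak{M}_0^N} E_{\tilde P} \frac{f_\tau}{B_\tau} = \overline{C}(f_N; P).$$
Hence
$$E_{\tilde P} \frac{X_{\tilde\tau}^{\pi,C}}{B_{\tilde\tau}} = E_{\tilde P} \frac{f_{\tilde\tau}}{B_{\tilde\tau}},$$
which, coupled with the property $X_{\tilde\tau}^{\pi,C} \ge f_{\tilde\tau}$, proves that, in fact, $X_{\tilde\tau}^{\pi,C} = f_{\tilde\tau}$ ($P$-a.s.), i.e., $\tilde\tau$ is a rational time.

Remark. It may be useful to repeat at this point that a solution of the optimal stopping problem (26) provides one with both the value of the rational price $\overline{C}(f_N; P)$ and the rational exercise time. Usually, one cannot find $\overline{C}(f_N; P)$ or $\tilde\tau$ separately; they can be found only in tandem, by solving (26).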
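
The ‘tandem’ nature of price and exercise time can be seen in a small computation. The sketch below again assumes a hypothetical two-period binomial model with a put pay-off (the parameters are illustrative, not from the text): it solves the stopping problem by backward induction, applies the rule (27) along every path, and compares the resulting expected discounted pay-off with the price and with the value of waiting until maturity.

```python
from fractions import Fraction
from itertools import product

def snell(S0, u, d, r, K, N):
    """Backward induction for Y_n = max(f_n / B_n, E(Y_{n+1} | F_n)), B_n = (1 + r)**n."""
    p = (1 + r - d) / (u - d)          # up-probability under the martingale measure

    def payoff(n, j):                  # discounted put pay-off at node (n, j up-moves)
        return max(K - S0 * u**j * d**(n - j), 0) / (1 + r)**n

    Y = [[payoff(N, j) for j in range(N + 1)]]
    for n in range(N - 1, -1, -1):
        nxt = Y[0]
        Y.insert(0, [max(payoff(n, j), p * nxt[j + 1] + (1 - p) * nxt[j])
                     for j in range(n + 1)])
    return p, Y, payoff

def value_of_rule(p, N, payoff, stop):
    """Expected discounted pay-off of the rule 'stop at the first node where stop(n, j)'."""
    total = Fraction(0)
    for path in product((0, 1), repeat=N):            # 1 = up-move, 0 = down-move
        n, j = 0, 0
        while n < N and not stop(n, j):
            j += path[n]
            n += 1
        ups = sum(path)
        total += p**ups * (1 - p)**(N - ups) * payoff(n, j)
    return total

S0, u, d, r, K, N = (Fraction(100), Fraction(2), Fraction(1, 2),
                     Fraction(1, 4), Fraction(100), 2)
p, Y, payoff = snell(S0, u, d, r, K, N)
# Rule (27): stop as soon as the envelope equals the immediate discounted pay-off.
rational_value = value_of_rule(p, N, payoff, lambda n, j: Y[n][j] == payoff(n, j))
# Comparison rule: never exercise before maturity.
maturity_value = value_of_rule(p, N, payoff, lambda n, j: False)
print(Y[0][0], rational_value, maturity_value)
```

The first-hitting rule attains the price exactly, whereas postponing exercise until maturity leaves part of the value unrealized in this example.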
§ 2d. Optional Decomposition

1. For a filtered probability space $(\Omega, \mathscr{F}, (\mathscr{F}_n)_{n \le N}, P)$ with $\mathscr{F}_0 = \{\varnothing, \Omega\}$, let $X = (X_n)_{n \le N}$ be a real-valued process and let $S = (S_n)_{n \le N}$, where $S_n = (S_n^1, \dots, S_n^d)$, be an $\mathbb{R}^d$-valued process that are both adapted to $(\mathscr{F}_n)_{n \le N}$: the $X_n$ and $S_n^i$ must be $\mathscr{F}_n$-measurable for $n \le N$ and $i = 1, \dots, d$.

Let $\mathscr{P}(P)$ be the set of probability measures $\tilde P$ on $(\Omega, \mathscr{F})$ such that $\tilde P \sim P$ and $S$ is a $\tilde P$-martingale. We assume that $\mathscr{P}(P) \ne \varnothing$.

As regards the process $X = (X_n)_{n \le N}$, our main assumption is that it is a supermartingale with respect to each measure $\tilde P \in \mathscr{P}(P)$.

Considering $X$ with respect to a particular measure $\tilde P \in \mathscr{P}(P)$ we see that, by the Doob decomposition (Chapter II, § 1b),
$$X_n = X_0 + M_n^{(\tilde P)} - C_n^{(\tilde P)}, \tag{1}$$
where $M^{(\tilde P)} = (M_n^{(\tilde P)}, \mathscr{F}_n, \tilde P)$ is a martingale, $C^{(\tilde P)} = (C_n^{(\tilde P)}, \mathscr{F}_{n-1}, \tilde P)$ is a non-decreasing predictable process, $M_0^{(\tilde P)} = 0$, and $C_0^{(\tilde P)} = 0$. The components $M^{(\tilde P)}$ and $C^{(\tilde P)}$ in (1) depend on our choice of the measure $\tilde P$, which we emphasize by our notation.

In the theorem following next we describe another decomposition of $X$, the so-called optional decomposition. It is remarkable for its universal nature: its components (see (2)) are the same for all $\tilde P \in \mathscr{P}(P)$.


THEOREM. If a process $X$ is a supermartingale with respect to each martingale measure $\tilde P \in \mathscr{P}(P)$, then it has an (optional) decomposition
$$X_n = X_0 + \sum_{k=1}^{n} (\gamma_k, \Delta S_k) - C_n, \quad n \le N, \tag{2}$$
with predictable $\mathbb{R}^d$-valued process $\gamma = (\gamma_k)_{k \le N}$ and non-decreasing process $C = (C_n)_{n \le N}$ of $\mathscr{F}_n$-measurable variables $C_n$.

Before turning to the proof we note that there exists an essential difference between (1) and (2): the variables $C_n^{(\tilde P)}$ in (1) are $\mathscr{F}_{n-1}$-measurable, whereas $C_n$ in (2) is $\mathscr{F}_n$-measurable. It is just for this reason that (2) is called an optional decomposition.


Remark. In the present, discrete-time, case one says that a process $C = (C_n)_{n \le N}$ is optional with respect to $(\mathscr{F}_n)_{n \le N}$ when it is merely consistent with (or adapted to) the filtration $(\mathscr{F}_n)_{n \le N}$, i.e., when $C_n$ is $\mathscr{F}_n$-measurable, $n \ge 0$. See Chapter III, § 5a and [250; Chapter I, § 1c] for the concept of optional process.
First versions of the above theorem were established in [136] and [281] for the continuous-time case (as already mentioned in § 1c).

Before long, several other papers devoted to both discrete and continuous time were published ([99], [163]–[165]), in which, in particular, the assumptions of [136] and [281] were weakened and various versions of proofs were suggested.

The proof below follows the scheme of H. Föllmer and Yu. M. Kabanov [163], [164], which is based on the idea of treating the $\gamma_k$ in (2) as Lagrange multipliers in certain problems of optimization with constraints. (We shall also use several results of Chapter V, § 2e in the proof.)


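2. In accordance with [163] and [164], we shall have established the decomposition (2) once we have shown that for each $n = 1, \dots, N$ the variables $\Delta X_n = X_n - X_{n-1}$ can be represented as the differences
$$\Delta X_n = (\gamma_n, \Delta S_n) - c_n, \tag{3}$$
where $\gamma_n$ is an $\mathscr{F}_{n-1}$-measurable $\mathbb{R}^d$-valued variable and $c_n$ is a non-negative $\mathscr{F}_n$-measurable variable.

To obtain such a representation of $\Delta X_n$ it is in fact sufficient to show that (under the above assumptions about $X$ and $S$) there exists an $\mathscr{F}_{n-1}$-measurable $\mathbb{R}^d$-valued variable $\gamma_n$ such that
$$\Delta X_n - (\gamma_n, \Delta S_n) \le 0. \tag{4}$$
In this case, we can take $(\gamma_n, \Delta S_n) - \Delta X_n$ as the required variable $c_n$.
We note also that if $\tilde P \in \mathscr{P}(P)$, then
$$E_{\tilde P}|\Delta S_n| < \infty, \quad E_{\tilde P}(\Delta S_n \mid \mathscr{F}_{n-1}) = 0 \tag{5}$$
and
$$E_{\tilde P}|\Delta X_n| < \infty, \quad E_{\tilde P}(\Delta X_n \mid \mathscr{F}_{n-1}) \le 0. \tag{6}$$
If $P_n$ and $\tilde P_n$ are the restrictions of $P$ and $\tilde P$ to $\mathscr{F}_n$ and $Z_n = \frac{d\tilde P_n}{dP_n}$, then by Bayes’s formula (Chapter V, § 3a),
$$E_{\tilde P}(\Delta S_n \mid \mathscr{F}_{n-1}) = E_P(z_n \Delta S_n \mid \mathscr{F}_{n-1}), \tag{7}$$
$$E_{\tilde P}(\Delta X_n \mid \mathscr{F}_{n-1}) = E_P(z_n \Delta X_n \mid \mathscr{F}_{n-1}), \tag{8}$$
where $z_n = \frac{Z_n}{Z_{n-1}}$.

Hence it is easy to see that (4) is a consequence of the following general result (where we set $\xi = \Delta X_n$ and $\eta = \Delta S_n$), which is also of independent, ‘purely probabilistic’, interest.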
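LEMMA. Let $(\Omega, \mathscr{F}, P)$ be a probability space and let $\xi$ and $\eta = (\eta^1, \dots, \eta^d)$ be a real-valued variable and an $\mathbb{R}^d$-valued variable in this space.

Let $\mathscr{G}$ be a $\sigma$-subalgebra of $\mathscr{F}$ and let $Z$ be the set of all random variables $z > 0$ such that, $P$-a.s.,
$$E(z \mid \mathscr{G}) = 1, \quad E(|\xi| z \mid \mathscr{G}) < \infty, \quad E(|\eta| z \mid \mathscr{G}) < \infty \tag{9}$$
and
$$E(z\eta \mid \mathscr{G}) = 0. \tag{10}$$
If $Z \ne \varnothing$ and
$$E(z\xi \mid \mathscr{G}) \le 0 \tag{11}$$
for all $z \in Z$, then there exists a $\mathscr{G}$-measurable $\mathbb{R}^d$-valued vector $\lambda^*$ such that
$$\xi + (\lambda^*, \eta) \le 0 \quad (P\text{-a.s.}). \tag{12}$$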


Proof. a) The idea of the proof is already transparent when $\mathscr{G}$ is a trivial $\sigma$-subalgebra, i.e., $\mathscr{G} = \{\varnothing, \Omega\}$. In this case the required vector $\lambda^*$ is nonrandom and can be treated (as shown below) as a Lagrange multiplier in a certain optimization problem.

If $\mathscr{G}$ is a nontrivial $\sigma$-subalgebra, then we can carry out the same arguments for each $\omega$ and, again, obtain a vector $\lambda^*$ (not uniquely defined, in general), which depends on $\omega$. After that, the entire problem is reduced to the proof that $\lambda^*$ can be chosen $\mathscr{G}$-measurable.

We recall that we encountered the same measurability problem in Chapter V, § 2e, in our proof of the extended version of the First fundamental theorem (in the discussion of the implications a') $\Rightarrow$ e) and e) $\Rightarrow$ b)). We referred there to certain general results on the existence of a measurable selection (Lemma 2 in Chapter V, § 2e).

The same techniques are applicable in our context for the proof of the existence of a $\mathscr{G}$-measurable selection $\lambda^*$. Referring to [163] and [164] for detail we note that there is no such problem of measurability in the case of a discrete space $\Omega$.


b) Thus, we shall assume that $\mathscr{G} = \{\varnothing, \Omega\}$.
Let $Q = Q(dx, dy)$ be the measure in $\mathbb{R} \times \mathbb{R}^d$ generated by the variables $\xi$ and $\eta = (\eta^1, \dots, \eta^d)$:
$$Q(dx, dy) = P(\xi \in dx,\ \eta \in dy).$$
Without loss of generality we can assume that the family of random variables $\eta^1, \dots, \eta^d$ is ($P$-a.s.) a linearly independent system, i.e., if $a^1 \eta^1 + \dots + a^d \eta^d = 0$ ($P$-a.s.) for some coefficients $a^1, \dots, a^d$, then $a^1 = \dots = a^d = 0$. Indeed, $\eta^1, \dots, \eta^d$ enter (12) in a linear manner. If they were linearly dependent, then the problem could be reduced to another, with vector $\eta$ of lower dimension.
As in Chapter V, § 2e, let $L^\circ(Q)$ be the relative interior of the closed convex hull $L(Q)$ of the topological support $K(Q)$ of the measure $Q$.

Let $x' = (x, y)$, $x \in \mathbb{R}$, $y \in \mathbb{R}^d$, and let $Z(Q)$ be the family of Borel functions $z = z(x') > 0$ such that $E_Q z = 1$ and $E_Q |x'| z < \infty$.

Let
$$\Phi(Q) = \{\varphi(z)\colon \varphi(z) = E_Q x' z,\ z \in Z(Q)\}$$
be the family of barycenters (of the measures $dQ' = z\, dQ$).
By Lemma 1 in Chapter V, § 2e we obtain that $L^\circ(Q) \subseteq \Phi(Q)$ (in fact, $L^\circ(Q) = \Phi(Q)$) and if $0 \notin L^\circ(Q)$, then there exists $\gamma' \in \mathbb{R}^{d+1}$ such that
$$Q\{x'\colon (\gamma', x') \ge 0\} = 1 \quad \text{and} \quad Q\{x'\colon (\gamma', x') > 0\} > 0. \tag{13}$$


To prove the existence of a vector $\lambda^*$ with property (12) we consider separately the following two cases:

(i) $0 \notin L^\circ(Q)$,
(ii) $0 \in L^\circ(Q)$.
Case (i). By (13), there exist numbers $\gamma$ and $\gamma^1, \dots, \gamma^d$ such that ($P$-a.s.)
$$\gamma\xi + (\gamma^1 \eta^1 + \dots + \gamma^d \eta^d) \ge 0 \tag{14}$$
and, with positive $P$-probability,
$$\gamma\xi + (\gamma^1 \eta^1 + \dots + \gamma^d \eta^d) > 0. \tag{15}$$
We claim that $\gamma \ne 0$. For if $\gamma = 0$, then $\gamma^1 \eta^1 + \dots + \gamma^d \eta^d \ge 0$ ($P$-a.s.). By hypothesis, there exists a martingale measure $\tilde P \sim P$ such that
$$E_{\tilde P}(\gamma^1 \eta^1 + \dots + \gamma^d \eta^d) = 0,$$
therefore $\gamma^1 \eta^1 + \dots + \gamma^d \eta^d = 0$ (both $P$- and $\tilde P$-a.s.).

Since $\eta^1, \dots, \eta^d$ are assumed to be linearly independent, this means that $\gamma^1 = \dots = \gamma^d = 0$, which, however, contradicts (15).

Hence $\gamma \ne 0$ and we see from relation (14) and our assumptions $E_{\tilde P}\xi \le 0$, $E_{\tilde P}\eta = 0$ that $\gamma < 0$.

Setting $\lambda^i = \frac{\gamma^i}{\gamma}$, $i = 1, \dots, d$, we obtain by (14) the following inequality:
$$\xi + (\lambda^1 \eta^1 + \dots + \lambda^d \eta^d) \le 0,$$
which proves (12) with $\lambda^* = (\lambda^1, \dots, \lambda^d)$ in case (i).
Case (ii) is slightly more complicated. It is in this case that we shall use the idea of [163] and [164] about Lagrange multipliers.
Let
$$\varphi_\xi(z) = E_Q xz, \quad \varphi_\eta(z) = E_Q yz$$
be the components of the barycenter $\varphi(z)$.
We set
$$Z_0(Q) = \{z \in Z(Q)\colon \varphi_\eta(z) = 0\}$$
and
$$\Phi_0(Q) = \{\varphi(z) = (\varphi_\xi(z), \varphi_\eta(z))\colon z \in Z_0(Q)\}.$$
By hypothesis, $Z_0(Q) \ne \varnothing$ and
$$z \in Z_0(Q) \implies \varphi_\xi(z) \le 0. \tag{16}$$
If $0 \in L^\circ(Q)$, then, as already observed, $L^\circ(Q) \subseteq \Phi(Q)$, and by Chapter V, § 2e we obtain that there exists $z_0 \in Z_0(Q)$ such that $\varphi_\xi(z_0) = 0$. Hence, in case (ii),
$$\sup_{z \in Z_0(Q)} \varphi_\xi(z) = 0, \tag{17}$$
which we can interpret as follows: in this case the value of $f^*$ in the optimization problem
$$\text{“find } f^* = \sup_{z \in Z_0(Q)} \varphi_\xi(z)\text{”} \tag{18}$$
is equal to zero.
Following [163] and [164] we now reformulate (18) as an optimization problem with constraints:
$$\text{“find } f^* = \sup_{z \in Z(Q)} \varphi_\xi(z) \text{ under the additional constraint } \varphi_\eta(z) = 0\text{”}. \tag{19}$$
According to the principles of variational calculus, for some non-zero vector $\lambda^*$ (the Lagrange multiplier) the problem (19) is equivalent to the following optimization problem:
$$\text{“find } f^* = \sup_{z \in Z(Q)} \bigl(\varphi_\xi(z) + \lambda^* \varphi_\eta(z)\bigr)\text{”}. \tag{20}$$
(For simplicity, we denote here and below the scalar product $(a, b)$ of vectors $a$ and $b$ by $ab$.)

We shall now prove that the problems (19) and (20) are equivalent (this is interesting on its own, although it would suffice for our aims to show that, under the assumption (11), we have
$$\sup_{z \in Z(Q)} \bigl(\varphi_\xi(z) + \lambda^* \varphi_\eta(z)\bigr) \le 0; \tag{21}$$
at any rate, for some non-zero vector $\lambda^*$).
Let
$$A = \{(x, y) \in \mathbb{R} \times \mathbb{R}^d : x < \varphi_\xi(z) \text{ and } y = \varphi_\eta(z) \text{ for some } z \in Z(Q)\}.$$

This set is nonempty and convex. By assumption (ii), $(0,0) \notin A$, therefore we
can separate the origin and $A$ by a hyperplane, i.e., there exists a nonzero vector
$\lambda = (\lambda_1, \lambda_2) \in \mathbb{R} \times \mathbb{R}^d$ such that

$$\lambda_1 x + \lambda_2 y \le 0 \tag{22}$$

for all $(x,y)$ in the closure of $A$. (See, e.g., [241; §0.1]; note that we have in fact
employed the idea of ‘separation’ also in case (i).)
Note that $A$ contains all points $(x,0)$ with negative $x$. Hence $\lambda_1 \ge 0$ in (22).
Further, if we assume that $\lambda_1 = 0$, then we see from (22) that

$$E_Q(\lambda_2 y) z = \lambda_2 E_Q y z = \lambda_2 \varphi_\eta(z) \le 0 \tag{23}$$

for all $z \in Z(Q)$. Since this holds for every $z \in Z(Q)$, we have $\lambda_2 y \le 0$ ($Q$-a.s.), i.e., $\lambda_2 \eta \le 0$ ($P$-a.s.).
Since $\mathscr{P}(P) \neq \varnothing$, it follows that $E_{\widetilde{P}} \lambda_2 \eta = 0$ for some measure $\widetilde{P}$ in $\mathscr{P}(P)$.

Combined with the property $\lambda_2 \eta \le 0$ ($P$-a.s.) this shows that we have linear
dependence ($\lambda_2 \eta = 0$), which is ruled out by the above assumption. Hence $\lambda_2 = 0$.
Consequently, if $\lambda_1 = 0$, then also $\lambda_2 = 0$, which contradicts the assumption
that the vector $(\lambda_1, \lambda_2)$ in (22) is not zero.
Thus, $\lambda_1 > 0$.


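We now set $\lambda^* = \lambda_2 / \lambda_1$. Then we see from (22) and the definition of $A$ that for all
$z \in Z(Q)$ and $\varepsilon > 0$ we have

$$(\varphi_\xi(z) - \varepsilon) + \lambda^* \varphi_\eta(z) \le 0.$$
Passing to the limit as $\varepsilon \downarrow 0$ we obtain
$$\varphi_\xi(z) + \lambda^* \varphi_\eta(z) \le 0, \qquad z \in Z(Q),$$

which is equivalent to the inequality

$$\sup_{z \in Z(Q)} \bigl(\varphi_\xi(z) + \lambda^* \varphi_\eta(z)\bigr) \le 0. \tag{24}$$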


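We now observe that, obviously,

$$\sup_{z \in Z(Q)} \bigl(\varphi_\xi(z) + \lambda \varphi_\eta(z)\bigr) \ge \sup_{z \in Z_0(Q)} \bigl(\varphi_\xi(z) + \lambda \varphi_\eta(z)\bigr) = \sup_{z \in Z_0(Q)} \varphi_\xi(z) \tag{25}$$

for each $\lambda \in \mathbb{R}^d$.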
If (ii) holds, then the right-hand side of (25) is equal to zero, while the left-hand
side vanishes for $\lambda = \lambda^*$. Hence, under assumption (ii) we obtain

$$\sup_{z \in Z_0(Q)} \varphi_\xi(z) = 0 \implies \sup_{z \in Z(Q)} \bigl(\varphi_\xi(z) + \lambda^* \varphi_\eta(z)\bigr) = 0,$$

which establishes the equivalence of (19) and (20) and can be interpreted as follows: the Lagrange multiplier ‘lifts’ the constraint $\varphi_\eta(z) = 0$ in the optimization
problem (19).

We now turn back to (24) or, equivalently, to the inequality

$$\varphi_\xi(z) + \lambda^* \varphi_\eta(z) \le 0, \qquad z \in Z(Q), \tag{26}$$
which can be rewritten as follows:
$$E_Q z(x + \lambda^* y) \le 0, \qquad z \in Z(Q).$$

The space $Z(Q)$ is ‘sufficiently rich’, therefore $x + \lambda^* y \le 0$ ($Q$-a.s.), which proves
required property (12) in the case of $\mathscr{G} = \{\varnothing, \Omega\}$.


In the general case, as already mentioned, one must consider in place of $Q(dx, dy)$
regular conditional distributions

$$Q(\omega; dx, dy) = P(\xi \in dx,\ \eta \in dy \mid \mathscr{G})(\omega)$$

and prove the existence of $\lambda^* = \lambda^*(\omega)$ for each $\omega$. At the final stage one must show
that we can choose a $\mathscr{G}$-measurable version of the function $\lambda^* = \lambda^*(\omega)$, $\omega \in \Omega$, by
using general results on measurable selections, such as Lemma 2 in Chapter V, § 2e.
See the details in [163] and [164].


3. Scheme of Series of ‘Large’ Arbitrage-Free Markets 
and Asymptotic Arbitrage 


§3a. One Model of ‘Large’ Financial Markets 


1. We encountered already ‘large’ markets and the concept of asymptotic arbitrage 
in Chapter I, § 2d, in the discussion of the basic principles of Arbitrage pricing theory 
(APT) pioneered by S. Ross [412]. 

As in H. Markowitz’s theory ([331]-[333], see also Chapter I, § 2b), based on 
the analysis of the mean value and variance of the capital corresponding to various 
investment portfolios, asymptotic arbitrage in Ross’s theory is also defined in terms 
of these parameters. 

In the present section we define asymptotic arbitrage in a somewhat different 
manner, which is more consistent with the concept of arbitrage considered in Chap- 
ter V, §2a and more adequate to the martingale approach permeating our entire 
presentation. 


2. Developing further our initial model of a $(B, S)$-market formed by a bank account
$B = (B_k)_{k \le n}$ and a $d$-dimensional stock $S = (S_k)_{k \le n}$ (where $S_k = (S_k^1, \ldots, S_k^d)$),
both defined on some filtered probability space

$$(\Omega, \mathscr{F}, (\mathscr{F}_k)_{k \le n}, P),$$

we assume now that we have a scheme of series of $n$-markets

$$(B^n, S^n) = (B_k^n, S_k^n)_{k \le k(n)}$$

with $S_k^n = (S_k^{n,1}, \ldots, S_k^{n,d(n)})$, and each market is defined on a probability space
$$(\Omega^n, \mathscr{F}^n, (\mathscr{F}_k^n)_{k \le k(n)}, P^n) \tag{1}$$
‘of its own’.

Here we assume that $n \ge 1$, $\mathscr{F}_0^n = \{\varnothing, \Omega^n\}$, $\mathscr{F}^n = \mathscr{F}^n_{k(n)}$ with $k(n) < \infty$ and
$d(n) < \infty$.

We are mainly interested in the cases when $k(n) \to \infty$ and (or) $d(n) \to \infty$ as
$n \to \infty$. It is in this sense that we shall speak about ‘large’ markets.


Remark. In our considerations of the scheme of series of probability spaces the 
index n is the number of a series, while k plays the role of a time parameter. 


3. Let $X^{\pi(n)} = (X_k^{\pi(n)})_{k \le k(n)}$ be a capital corresponding to some self-financing
portfolio $\pi(n)$ in a $(B^n, S^n)$-market.

Recall that we agree to assume that the variables $B_k^n$ are positive and $\mathscr{F}_{k-1}^n$-measurable. We explained at the end of §2b in Chapter V that we can assume
without loss of generality that $B_k^n \equiv 1$, which corresponds to a transition to discounted prices. In this case,

$$X_k^{\pi(n)} = X_0^{\pi(n)} + \sum_{l=1}^{k} (\gamma_l^n, \Delta S_l^n),$$
where $(\gamma_l^n, \Delta S_l^n) = \sum_{i=1}^{d(n)} \gamma_l^{n,i}\, \Delta S_l^{n,i}$.
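The capital formula above can be sketched numerically. The following is a minimal illustration only, assuming discounted prices ($B_k^n \equiv 1$); the function name `capital_path` and the sample numbers are hypothetical and do not come from the text.

```python
from typing import List, Sequence


def capital_path(x0: float,
                 gammas: Sequence[Sequence[float]],
                 prices: Sequence[Sequence[float]]) -> List[float]:
    """Return [X_0, ..., X_K] for discounted prices (B_k = 1).

    prices[k] is the price vector S_k, k = 0..K; gammas[l - 1] is the
    (predictable) position gamma_l held over the step from l-1 to l.
    """
    path = [x0]
    for l in range(1, len(prices)):
        # Price increments Delta S_l = S_l - S_{l-1}, one per stock.
        increments = [now - prev for now, prev in zip(prices[l], prices[l - 1])]
        # Scalar product (gamma_l, Delta S_l).
        gain = sum(g * ds for g, ds in zip(gammas[l - 1], increments))
        path.append(path[-1] + gain)
    return path


# Hypothetical two-stock, two-step market (numbers chosen for illustration only).
example_prices = [[1.0, 2.0], [1.5, 1.0], [1.0, 3.0]]
example_gammas = [[2.0, 1.0], [0.0, 0.5]]
example_path = capital_path(0.0, example_gammas, example_prices)
```

Here the first step gains $2 \cdot 0.5 + 1 \cdot (-1) = 0$ and the second gains $0 \cdot (-0.5) + 0.5 \cdot 2 = 1$, so the capital moves from 0 to 0 to 1.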
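DEFINITION 1. We shall say that a sequence of strategies $\pi = (\pi(n))_{n \ge 1}$ realizes an
asymptotic arbitrage in a scheme of series $(B, S) = \{(B^n, S^n),\ n \ge 1\}$ of $n$-markets
$(B^n, S^n)$ if

$$\lim_n X_0^{\pi(n)} = 0, \tag{2}$$
$$X_{k(n)}^{\pi(n)} \ge -c(n) \quad (P^n\text{-a.s.}), \qquad n \ge 1, \tag{3}$$
where $0 < c(n) \downarrow 0$ as $n \to \infty$, and
$$\lim_{\varepsilon \downarrow 0} \varlimsup_n P^n\bigl(X_{k(n)}^{\pi(n)} \ge \varepsilon\bigr) > 0. \tag{4}$$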


Using the above-introduced concepts and notation we can say (slightly broadening the arguments of Chapter V, §2a) that the asymptotic arbitrage in APT
considered in Chapter I, § 2d occurs if there exists a subsequence $(n') \subseteq (n)$ and a
sequence of strategies $(\pi(n'))$ such that $X_0^{\pi(n')} = 0$ and $E X_{k(n')}^{\pi(n')} \to \infty$, $D X_{k(n')}^{\pi(n')} \to 0$
as $n' \to \infty$.

Definition (4) of asymptotic arbitrage is more convenient and preferable from 
the ‘martingale’ standpoint to that of APT, which can be explained as follows. 

First, we can consider (4) as a natural generalization of our earlier definition of 
opportunities for arbitrage (Chapter V, § 2a), and the latter, as we know from the 
First fundamental theorem, relates arbitrage theory and the theory of martingales 
and stochastic calculus in a straightforward way. 

Second, taking the definition (4) or a similar one (see [260], [261], [273]) we can 
find effective criteria of the absence of asymptotic arbitrage. They include criteria in 
terms of such well-known concepts of the theory of stochastic processes as Hellinger 
integrals and Hellinger processes (see [250; Chapter V]).
4. DEFINITION 2. We say that a $(B, S)$-market that is a collection $\{(B^n, S^n),\ n \ge 1\}$
of $n$-markets is locally arbitrage-free if the market $(B^n, S^n)$ is arbitrage-free (Chapter V, § 2a) for each $n \ge 1$.


The central question in what follows is to find conditions ensuring that there is 
no asymptotic arbitrage (in the sense of Definition 1 above) on a locally arbitrage- 
free (B,S)-market. 

The existence of asymptotic arbitrage as $n \to \infty$ can have various reasons: the
growth in the number of shares ($d(n) \to \infty$), the increase of the time interval
($k(n) \to \infty$; see Example 2 in Chapter V, § 2b), breaking down of the asymptotic
equivalence of measures. It can also be brought about by a combination of these
factors.

In connection with the above-mentioned asymptotic equivalence of measures and 
the related asymptotic analogues of absolute continuity and singularity of sequences 
of probability measures it should be noted that their precise definitions involve the 
concepts of contiguity and complete asymptotic separability (see [250; Chapter V], 
where one can find also criteria of these properties formulated in terms of Hellinger 
integrals and processes). 

The importance of these concepts in the problem of asymptotic arbitrage on 
large markets has been pointed out for the first time by Yu. M. Kabanov and 
D. O. Kramkov [261] who have introduced the concepts of asymptotic arbitrage of 
the first and the second kinds. (In accordance with the nomenclature in [261] the 
asymptotic arbitrage of Definition 1 is of the first kind.) Later on, the theory of 
asymptotic arbitrage has been significantly developed by I. Klein and W. Schacher- 
mayer [273] and by Yu. M. Kabanov and D. O. Kramkov in [260]. 


§ 3b. Criteria of the Absence of Asymptotic Arbitrage 


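1. Let $X^{\pi(n)} = (X_k^{\pi(n)})_{k \le k(n)}$ be the capital corresponding to some self-financing
portfolio $\pi(n)$ in a $(B^n, S^n)$-market with $B_k^n \equiv 1$:

$$X_k^{\pi(n)} = X_0^{\pi(n)} + \sum_{l=1}^{k} (\gamma_l^n, \Delta S_l^n). \tag{1}$$
If $Q^n$ is a measure on $(\Omega^n, \mathscr{F}^n)$ such that $Q^n \ll P^n$, then we obtain by Bayes's
formula ((4) in Chapter V, § 3a) that ($Q^n$-a.s.)

$$E_{Q^n}\bigl(X_{k(n)}^{\pi(n)} \mid \mathscr{F}_k^n\bigr) = \frac{1}{Z_k^n}\, E_{P^n}\bigl(X_{k(n)}^{\pi(n)} Z_{k(n)}^n \mid \mathscr{F}_k^n\bigr), \tag{2}$$

where $Z_k^n = \dfrac{dQ_k^n}{dP_k^n}$, $Q_k^n = Q^n \mid \mathscr{F}_k^n$, and $P_k^n = P^n \mid \mathscr{F}_k^n$. (Here we assume that the
expectation on the left-hand side of (2) is well defined.)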
We assume that each of the $n$-markets $(B^n, S^n)$, $n \ge 1$, is arbitrage-free, and
therefore, in accordance with the First fundamental theorem (Chapter V, §§ 2b, e),
the family of martingale measures $\mathscr{P}(P^n)$ is nonempty.

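Let $\widetilde{P}^n \in \mathscr{P}(P^n)$ and let $\pi(n)$ be a strategy such that

$$X_{k(n)}^{\pi(n)} \ge -c(n) \quad (P^n\text{-a.s.}). \tag{3}$$
Then we obtain by the lemma in Chapter II, §1c, that the sequence $X^{\pi(n)} = (X_k^{\pi(n)})_{k \le k(n)}$ is a $\widetilde{P}^n$-martingale. Consequently, $E_{\widetilde{P}^n} |X_{k(n)}^{\pi(n)}| < \infty$ and

$$X_k^{\pi(n)} = E_{\widetilde{P}^n}\bigl(X_{k(n)}^{\pi(n)} \mid \mathscr{F}_k^n\bigr) \tag{4}$$

for $k \le k(n)$. Clearly, $E_{P^n} |X_{k(n)}^{\pi(n)} Z_{k(n)}^n| = E_{\widetilde{P}^n} |X_{k(n)}^{\pi(n)}| < \infty$, and, by (2) and (4),
$$X_k^{\pi(n)} Z_k^n = E_{P^n}\bigl(X_{k(n)}^{\pi(n)} Z_{k(n)}^n \mid \mathscr{F}_k^n\bigr) \quad (P^n\text{- and } \widetilde{P}^n\text{-a.s.}). \tag{5}$$
Hence assumption (3) ensures in our case of discrete time $k \le k(n) < \infty$ that
$$\bigl(X_k^{\pi(n)}, \mathscr{F}_k^n, \widetilde{P}^n\bigr)_{k \le k(n)} \quad\text{and}\quad \bigl(X_k^{\pi(n)} Z_k^n, \mathscr{F}_k^n, P^n\bigr)_{k \le k(n)}$$

are martingales. (Cf. the lemma in Chapter V, § 3d.)
In particular,

$$X_0^{\pi(n)} = E_{\widetilde{P}^n} X_{k(n)}^{\pi(n)}, \tag{6}$$
$$X_0^{\pi(n)} = E_{P^n} X_{k(n)}^{\pi(n)} Z_{k(n)}^n. \tag{7}$$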

Each of these equivalent relations can be used in the search of conditions ensuring
that the strategy $\pi(n)$ is arbitrage-free and the sequence of strategies $\pi = (\pi(n))_{n \ge 1}$
is asymptotically arbitrage-free. (Note that, in fact, we used just these relations in
our proof of the sufficiency in the First fundamental theorem, in Chapter V, § 2c.)


Remark. It is not necessary for formulas (4)–(7) to hold that $\widetilde{P}^n \sim P^n$. It suffices
that $\widetilde{P}^n \ll P^n$. We must nevertheless assume that $P^n \ll \widetilde{P}^n$ if we want to deduce
from (7), say, the absence of arbitrage. For an explanation assume that $X_0^{\pi(n)} = 0$,
$X_{k(n)}^{\pi(n)} \ge 0$ ($P^n$-a.s.), and $A = \{X_{k(n)}^{\pi(n)} > 0\}$. Then it is clear from (7) that we
can say nothing on the probability $P^n(A)$ of a set $A$ if $Z_{k(n)}^n = 0$ on this set. This
explains why we can deduce the equality $P^n(X_{k(n)}^{\pi(n)} = 0) = 1$ from the assumption
$P^n(X_{k(n)}^{\pi(n)} \ge 0) = 1$ and the relation $0 = E_{P^n} X_{k(n)}^{\pi(n)} Z_{k(n)}^n$ only if $P^n(Z_{k(n)}^n > 0) = 1$.
(By assertion f) of the theorem in Chapter V, § 3a, this means that $P^n \ll \widetilde{P}^n$.)
Thus, the condition (in the definition) that the martingale measure $\widetilde{P}^n$ be equivalent to the measure $P^n$ ensures, in particular, that

$$0 < Z_{k(n)}^n < \infty \quad (P^n\text{-a.s.}). \tag{8}$$
2. For simplicity, we start our search of criteria of the absence of asymptotic arbitrage
from the stationary case, when we have a $(B, S)$-market, $(B, S) = \{(B^n, S^n),\ n \ge 1\}$,
of the following structure.

There exist a probability space $(\Omega, \mathscr{F}, (\mathscr{F}_k)_{k \ge 0}, P)$, $\mathscr{F} = \bigvee \mathscr{F}_k$, and a $(d + 1)$-dimensional process $(B, S) = (B_k, S_k)_{k \ge 0}$ of $\mathscr{F}_{k-1}$-measurable variables $B_k$ and
$\mathscr{F}_k$-measurable variables $S_k = (S_k^1, \ldots, S_k^d)$ such that each $n$-market has the structure $(B^n, S^n) = (B_k, S_k)_{k \le k(n)}$ with $k(n) = n$.

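Clearly (using the language of the scheme of series), we can assume that the market $(B^n, S^n)$ is defined on its own probability space $(\Omega, \mathscr{F}^n, (\mathscr{F}_k^n)_{k \le n}, P^n)$, which
has the properties $\mathscr{F}^n = \mathscr{F}_n$, $\mathscr{F}_k^n = \mathscr{F}_k$, $k \le n$, and $P^n = P \mid \mathscr{F}^n$.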

We can express this otherwise: in the ‘stationary case’ the number $d(n)$ of shares
is independent of $n$ ($d(n) = d$) and the market $(B^{n+1}, S^{n+1})$ is an ‘extension’ of
the market $(B^n, S^n)$ for each $n$.

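Let $\widetilde{\mathscr{P}} = \{(\widetilde{P}_k)_{k \ge 1}\}$ be the family of sequences $(\widetilde{P}_k)_{k \ge 1}$ of martingale measures
$\widetilde{P}_k$ that have the property of compatibility: $\widetilde{P}_{k+1} \mid \mathscr{F}_k = \widetilde{P}_k$, $k \ge 1$.

Given such a sequence of measures $(\widetilde{P}_k)_{k \ge 1}$ we can consider the associated sequence $Z = (Z_k)_{k \ge 0}$ of Radon–Nikodym derivatives $Z_k = \dfrac{d\widetilde{P}_k}{dP_k}$, $k \ge 1$, $Z_0 = 1$.
Let

$$\mathscr{Z}_k = \Bigl\{ Z_k : Z_k = \frac{d\widetilde{P}_k}{dP_k},\ (\widetilde{P}_k)_{k \ge 1} \in \widetilde{\mathscr{P}} \Bigr\}$$
and let

$$\mathscr{Z}_\infty = \Bigl\{ Z_\infty : Z_\infty = \lim_k \frac{d\widetilde{P}_k}{dP_k},\ (\widetilde{P}_k)_{k \ge 1} \in \widetilde{\mathscr{P}} \Bigr\}.$$

Although we do not assume the existence of a measure $\widetilde{P}$ on $(\Omega, \mathscr{F})$ such
that $\widetilde{P}_k = \widetilde{P} \mid \mathscr{F}_k$, note that the sequence $(Z_k, \mathscr{F}_k)_{k \ge 0}$ is nevertheless a (positive)
$P$-martingale thanks to the property $\widetilde{P}_{k+1} \mid \mathscr{F}_k = \widetilde{P}_k$. Hence, by Doob's convergence
theorem (Chapter V, § 3a) there exists ($P$-a.s.) $\lim Z_k$ $(= Z_\infty)$, and moreover,
$0 \le E Z_\infty \le 1$.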

THEOREM 1 (the stationary case). If $(B, S) = \{(B^n, S^n),\ n \ge 1\}$ is a locally
arbitrage-free market, then the condition

$$\lim_{\varepsilon \downarrow 0} \varlimsup_k \inf_{Z_k \in \mathscr{Z}_k} P(Z_k < \varepsilon) = 0 \tag{9}$$
is necessary and sufficient, while the condition

$$\lim_{\varepsilon \downarrow 0} \inf_{Z_\infty \in \mathscr{Z}_\infty} P(Z_\infty < \varepsilon) = 0 \tag{10}$$

is sufficient for the absence of asymptotic arbitrage.
Proof. The sufficiency of condition (9) is relatively easy to establish; we present its
proof below. (The sufficiency of (10) is a consequence of the implication (10) $\Longrightarrow$ (9).)
The proof of the necessity is more complicated. It is based on several results on
contiguity of probability measures and is presented in the next section, § 3c (subsection 9).

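Let $(\widetilde{P}_n)_{n \ge 1} \in \widetilde{\mathscr{P}}$, and let $\pi = (\pi(n))_{n \ge 1}$ be a sequence of strategies on a $(B,S)$-market, $(B,S) = \{(B^n,S^n),\ n \ge 1\}$, satisfying condition (3) in § 3a. Then we
have (3) and therefore also (6), which takes the following form in our ‘stationary
case’:

$$X_0^{\pi(n)} = E Z_n X_n^{\pi(n)}. \tag{11}$$

Hence, choosing $\varepsilon > 0$ we obtain
$$\begin{aligned}
E Z_n X_n^{\pi(n)} &= E Z_n X_n^{\pi(n)} \bigl( I(-c(n) \le X_n^{\pi(n)} \le 0) + I(0 < X_n^{\pi(n)} < \varepsilon) + I(X_n^{\pi(n)} \ge \varepsilon) \bigr) \\
&\ge -c(n) + E Z_n X_n^{\pi(n)} I(Z_n \ge \varepsilon)\, I(X_n^{\pi(n)} \ge \varepsilon) \\
&\ge -c(n) + \varepsilon^2 P(X_n^{\pi(n)} \ge \varepsilon,\ Z_n \ge \varepsilon) \\
&\ge -c(n) + \varepsilon^2 \bigl[ P(X_n^{\pi(n)} \ge \varepsilon) - P(Z_n < \varepsilon) \bigr],
\end{aligned}$$
and therefore
$$X_0^{\pi(n)} + c(n) + \varepsilon^2 P(Z_n < \varepsilon) \ge \varepsilon^2 P(X_n^{\pi(n)} \ge \varepsilon). \tag{12}$$
If conditions (2) and (3) in § 3a are satisfied, then we see from (12) that

$$\lim_{\varepsilon \downarrow 0} \varlimsup_n \inf_{Z_n \in \mathscr{Z}_n} P(Z_n < \varepsilon) \ge \lim_{\varepsilon \downarrow 0} \varlimsup_n P(X_n^{\pi(n)} \ge \varepsilon) \tag{13}$$

(because $(\widetilde{P}_n)_{n \ge 1}$ is an arbitrary sequence in $\widetilde{\mathscr{P}}$).

Hence it is clear that if (9) and, of course, (10) hold, then the sequence of
strategies $\pi = (\pi(n))_{n \ge 1}$ with properties (2)–(4) in § 3a cannot realize asymptotic
arbitrage.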
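Inequality (12), $X_0^{\pi(n)} + c(n) + \varepsilon^2 P(Z_n < \varepsilon) \ge \varepsilon^2 P(X_n^{\pi(n)} \ge \varepsilon)$, can be checked on a toy example. The sketch below is only an illustration on a finite sample space: the helper name `inequality_12_sides` and all numbers are hypothetical, and we assume a density $Z \ge 0$ with $EZ = 1$, a terminal capital $X \ge -c$, and $X_0 = EZX$ as in (11).

```python
def inequality_12_sides(probs, z, x, c, eps):
    """Return (lhs, rhs) of  X_0 + c + eps^2 P(Z < eps) >= eps^2 P(X >= eps).

    Assumes a finite sample space with probabilities `probs`, a density
    z >= 0 with E z = 1, and terminal capital x >= -c.
    """
    # Initial capital X_0 = E[Z X], as in relation (11).
    x0 = sum(p * zi * xi for p, zi, xi in zip(probs, z, x))
    prob_z_small = sum(p for p, zi in zip(probs, z) if zi < eps)
    prob_x_large = sum(p for p, xi in zip(probs, x) if xi >= eps)
    return x0 + c + eps ** 2 * prob_z_small, eps ** 2 * prob_x_large


# Hypothetical four-point model (illustrative numbers only).
probs = [0.25, 0.25, 0.25, 0.25]
density = [0.1, 0.9, 1.2, 1.8]       # E Z = 1
capital = [5.0, -0.2, 0.3, -0.2]     # bounded below by -c with c = 0.2
sides = [inequality_12_sides(probs, density, capital, 0.2, eps)
         for eps in (0.05, 0.2, 0.5, 1.0, 3.0)]
```

In this toy model a large payoff (5.0) sits exactly where the density is small (0.1), which is the mechanism the term $\varepsilon^2 P(Z_n < \varepsilon)$ accounts for.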

COROLLARY. Let $(\widetilde{P}_n)_{n \ge 1}$ be some sequence in $\widetilde{\mathscr{P}}$ and let $Z_\infty = \lim_n \dfrac{d\widetilde{P}_n}{dP_n}$. Then
the condition $P(Z_\infty > 0) = 1$ ensures the absence of asymptotic arbitrage.


3. We now proceed to the general case of arbitrage-free $n$-markets $(B^n, S^n)$, $n \ge 1$,
defined on filtered probability spaces

$$(\Omega^n, \mathscr{F}^n, (\mathscr{F}_k^n)_{k \le k(n)}, P^n)$$
‘of their own’, where $\mathscr{F}^n_{k(n)} = \mathscr{F}^n$.

If $\widetilde{P}^n_{k(n)}$ is a martingale measure, $\widetilde{P}^n_{k(n)} \sim P^n_{k(n)}$, and $Z^n_{k(n)} = \dfrac{d\widetilde{P}^n_{k(n)}}{dP^n_{k(n)}}$, then in a
similar way to (12) we obtain
$$X_0^{\pi(n)} + c(n) + \varepsilon^2 P^n\bigl(Z^n_{k(n)} < \varepsilon\bigr) \ge \varepsilon^2 P^n\bigl(X^{\pi(n)}_{k(n)} \ge \varepsilon\bigr). \tag{14}$$


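We set

$$\mathscr{Z}^n_{k(n)} = \Bigl\{ Z^n_{k(n)} : Z^n_{k(n)} = \frac{d\widetilde{P}^n_{k(n)}}{dP^n_{k(n)}},\ \widetilde{P}^n_{k(n)} \in \mathscr{P}(P^n) \Bigr\}.$$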
THEOREM 2. Let $(B,S) = \{(B^n,S^n),\ n \ge 1\}$ be a ‘large’ locally arbitrage-free
market. Then the condition

$$\lim_{\varepsilon \downarrow 0} \varlimsup_n \inf_{Z^n_{k(n)} \in \mathscr{Z}^n_{k(n)}} P^n\bigl(Z^n_{k(n)} < \varepsilon\bigr) = 0 \tag{15}$$

is necessary and sufficient for the absence of asymptotic arbitrage.


Proof. The sufficiency of (15) follows from (14) as in the ‘stationary’ case. For the
proof of the necessity, see § 3c.9.


§ 3c. Asymptotic Arbitrage and Contiguity


1. It is clear from our previous discussion of arbitrage theory in this chapter that 
the issue of the absolute continuity of probability measures plays an important role 
there. As will be obvious from what we write below, a crucial role in the theory of 
asymptotic arbitrage is that of the concept of contiguity of probability measures, 
one of the important concepts used in asymptotic problems of mathematical statistics.

To introduce this concept in a most natural way we consider the stationary case 
first (see § 3b.2 for the definition and notation). 

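Let $(\widetilde{P}_n)_{n \ge 1}$ be a sequence of (compatible) martingale measures in $\widetilde{\mathscr{P}}$. Assume
that there exists also a measure $\widetilde{P}$ on $(\Omega, \mathscr{F})$ such that $\widetilde{P} \mid \mathscr{F}_n = \widetilde{P}_n$, $n \ge 1$.

We recall (see Chapter V, § 3a) that two measures, $\widetilde{P}$ and $P$, are said to be
locally equivalent ($\widetilde{P} \overset{\mathrm{loc}}{\sim} P$) if $\widetilde{P}_n \sim P_n$, $n \ge 1$. It should be pointed out here that the
relation $\widetilde{P} \overset{\mathrm{loc}}{\sim} P$ does not mean in general that $\widetilde{P} \ll P$, $P \ll \widetilde{P}$, or $\widetilde{P} \sim P$.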

It is clear from Theorem 1 in § 3b that if 


P(Zæ > 0) = 1, (1) 


where $Z_\infty = \lim Z_n$, $Z_n = \dfrac{d\widetilde{P}_n}{dP_n}$, then there is no asymptotic arbitrage.
By the theorem in Chapter V, § 3a condition (1) is equivalent to the relation
$P \ll \widetilde{P}$ (under our assumption that $\widetilde{P} \overset{\mathrm{loc}}{\sim} P$). Thus, we can regard the relation
$P \ll \widetilde{P}$ (or (1) if one likes it better) as just an additional (to $\widetilde{P} \overset{\mathrm{loc}}{\sim} P$) restriction on
the behavior of the probability measures associated with $n$-markets that prevents
asymptotic arbitrage as $n \to \infty$.

In the case when there exists no measure $\widetilde{P}$ on the probability space
$(\Omega, \mathscr{F}, (\mathscr{F}_n)_{n \ge 1})$ with $\mathscr{F} = \bigvee \mathscr{F}_n$ such that $\widetilde{P} \mid \mathscr{F}_n = \widetilde{P}_n$, $n \ge 1$, the following
definition is helpful for finding a counterpart to the assertion

$$P(Z_\infty > 0) = 1 \iff P \ll \widetilde{P}. \tag{2}$$


DEFINITION 1. Let $Q^n$ and $\widetilde{Q}^n$, $n \ge 1$, be probability measures on measurable
spaces $(E^n, \mathscr{E}^n)$. Then we say that the sequence $(\widetilde{Q}^n)_{n \ge 1}$ is contiguous with respect
to $(Q^n)_{n \ge 1}$ (our notation is $(\widetilde{Q}^n) \lhd (Q^n)$) if for all sequences of sets $A^n \in \mathscr{E}^n$ such
that $Q^n(A^n) \to 0$ as $n \to \infty$ we have $\widetilde{Q}^n(A^n) \to 0$ as $n \to \infty$.


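Remark 1. In the case when the spaces $(E^n, \mathscr{E}^n)$ and the measures $Q^n$ and $\widetilde{Q}^n$ are
independent of $n$ ($(E^n, \mathscr{E}^n) = (E, \mathscr{E})$, $Q^n = Q$, $\widetilde{Q}^n = \widetilde{Q}$) contiguity $(\widetilde{Q}^n) \lhd (Q^n)$
becomes the standard property of the absolute continuity $\widetilde{Q} \ll Q$ of the measures $\widetilde{Q}$
and $Q$ on $(E, \mathscr{E})$.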


THEOREM 1 (the stationary case). Let $(\widetilde{P}_n)$ be a sequence of martingale measures
in $\widetilde{\mathscr{P}}$. Then
$$P(Z_\infty > 0) = 1 \iff (P_n) \lhd (\widetilde{P}_n). \tag{3}$$

The condition of contiguity $(P_n) \lhd (\widetilde{P}_n)$ ensures the absence of asymptotic
arbitrage.

If $(\widetilde{P}_n)$ is a unique martingale sequence, then the condition $(P_n) \lhd (\widetilde{P}_n)$ is
necessary and sufficient for the absence of asymptotic arbitrage.


Proof. This is an immediate consequence of Theorem 1 in the previous section and 
Lemma 1 below, which contains several useful contiguity criteria. For a formulation 
we require additional definitions and notation. 


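2. Let $Q$ and $\widetilde{Q}$ be two probability measures on the measurable space $(E, \mathscr{E})$,
$\mathsf{Q} = \frac{1}{2}(Q + \widetilde{Q})$. We set $\mathfrak{z} = \dfrac{dQ}{d\mathsf{Q}}$, $\widetilde{\mathfrak{z}} = \dfrac{d\widetilde{Q}}{d\mathsf{Q}}$, and $Z = \dfrac{\widetilde{\mathfrak{z}}}{\mathfrak{z}}$. (Here we choose representatives
$\mathfrak{z}$ and $\widetilde{\mathfrak{z}}$ of the Radon–Nikodym derivatives such that $\mathfrak{z} + \widetilde{\mathfrak{z}} = 2$.)

We recall that, in accordance with the Lebesgue decomposition (see, e.g., [439;
Chapter III, §9]), we can represent $\widetilde{Q}$ as follows:

$$\widetilde{Q} = \widetilde{Q}_1 + \widetilde{Q}_2,$$
where

$$\widetilde{Q}_1(A) = E_Q Z I_A \quad\text{and}\quad \widetilde{Q}_2(A) = \widetilde{Q}\bigl(A \cap (Z = \infty)\bigr).$$

Since $Q(Z < \infty) = 1$, it follows that $\widetilde{Q}_1 \ll Q$ and $\widetilde{Q}_2 \perp Q$.
Hence $Z$ is just the Radon–Nikodym derivative of the absolutely continuous
component of $\widetilde{Q}$ with respect to the measure $Q$. (It is in this sense that we shall
use the notation $Z = \dfrac{d\widetilde{Q}}{dQ}$.)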


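DEFINITION 2. Let $\alpha \in (0, 1)$ and let
$$H(\alpha; Q, \widetilde{Q}) = E_{\mathsf{Q}}\, \mathfrak{z}^{\alpha}\, \widetilde{\mathfrak{z}}^{1-\alpha}, \tag{4}$$
$$H(Q, \widetilde{Q}) = H\bigl(\tfrac{1}{2}; Q, \widetilde{Q}\bigr), \tag{5}$$
$$\rho(Q, \widetilde{Q}) = \sqrt{1 - H(Q, \widetilde{Q})}. \tag{6}$$

We call the quantity $H(\alpha; Q, \widetilde{Q})$ the Hellinger integral of order $\alpha$ (of the measures
$Q$ and $\widetilde{Q}$); we call the quantity $H(Q, \widetilde{Q})$ simply the Hellinger integral, and $\rho(Q, \widetilde{Q})$
is the Hellinger distance between $Q$ and $\widetilde{Q}$.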
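For measures on a finite set the Hellinger integral reduces to a finite sum. The sketch below is our own illustration (the function names are hypothetical): it computes $H(\alpha; Q, \widetilde{Q})$ once directly as $\sum_i q_i^{\alpha} \widetilde{q}_i^{\,1-\alpha}$ and once through the dominating measure $\mathsf{Q} = \frac{1}{2}(Q + \widetilde{Q})$, so that one can see that both routes give the same value.

```python
import math


def hellinger_integral(alpha, q, q_tilde):
    """H(alpha; Q, Q~) = sum_i q_i^alpha * q~_i^(1 - alpha) on a finite set."""
    return sum((qi ** alpha) * (ti ** (1.0 - alpha))
               for qi, ti in zip(q, q_tilde) if qi > 0 and ti > 0)


def hellinger_integral_dominated(alpha, q, q_tilde):
    """Same quantity, computed as E_Q[z^alpha * z~^(1-alpha)] with Q = (Q + Q~)/2."""
    total = 0.0
    for qi, ti in zip(q, q_tilde):
        m = 0.5 * (qi + ti)                  # mass of the dominating measure
        if m > 0:
            z, z_tilde = qi / m, ti / m      # densities; z + z_tilde == 2
            total += m * (z ** alpha) * (z_tilde ** (1.0 - alpha))
    return total


def hellinger_distance(q, q_tilde):
    """rho(Q, Q~) = sqrt(1 - H(1/2; Q, Q~))."""
    return math.sqrt(max(0.0, 1.0 - hellinger_integral(0.5, q, q_tilde)))
```

Identical measures give $H = 1$ and $\rho = 0$; measures with disjoint supports give $H = 0$ and $\rho = 1$.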


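It can be proved (see, e.g., [250; Chapter IV, § 1a]) that $H(\alpha; Q, \widetilde{Q})$ is in fact
independent of the dominating measure $\mathsf{Q}$. This explains the widespread symbolic
notation

$$H(\alpha; Q, \widetilde{Q}) = \int (dQ)^{\alpha} (d\widetilde{Q})^{1-\alpha}, \tag{7}$$
$$H(Q, \widetilde{Q}) = \int \sqrt{dQ\, d\widetilde{Q}}, \tag{8}$$
$$\rho^2(Q, \widetilde{Q}) = \frac{1}{2} \int \bigl(\sqrt{dQ} - \sqrt{d\widetilde{Q}}\bigr)^2. \tag{9}$$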


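EXAMPLE. Let $Q = Q_1 \times Q_2 \times \cdots$, $\widetilde{Q} = \widetilde{Q}_1 \times \widetilde{Q}_2 \times \cdots$, where $Q_k$ and $\widetilde{Q}_k$ are the
Gaussian measures on $(\mathbb{R}, \mathscr{B}(\mathbb{R}))$ with densities

$$q_k(x) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{(x - \mu_k)^2}{2}}$$
and
$$\widetilde{q}_k(x) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{(x - \widetilde{\mu}_k)^2}{2}}.$$
Then
$$H(\alpha; Q_k, \widetilde{Q}_k) = \exp\Bigl\{ -\frac{\alpha(1-\alpha)}{2}\, (\mu_k - \widetilde{\mu}_k)^2 \Bigr\},$$
and therefore

$$H(\alpha; Q, \widetilde{Q}) = \exp\Bigl\{ -\frac{\alpha(1-\alpha)}{2} \sum_{k=1}^{\infty} (\mu_k - \widetilde{\mu}_k)^2 \Bigr\}. \tag{10}$$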
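The one-dimensional factor in the Gaussian example can be checked numerically. The sketch below assumes unit-variance Gaussian densities with means $\mu$ and $\widetilde{\mu}$, for which $\int q^{\alpha} \widetilde{q}^{\,1-\alpha}\,dx = \exp\{-\frac{\alpha(1-\alpha)}{2}(\mu - \widetilde{\mu})^2\}$; it compares this closed form with a plain trapezoidal quadrature. The function names and the integration window are our own choices, made for illustration only.

```python
import math


def gaussian_density(x, mean):
    """Density of N(mean, 1) at x."""
    return math.exp(-0.5 * (x - mean) ** 2) / math.sqrt(2.0 * math.pi)


def hellinger_integral_numeric(alpha, mu, mu_tilde, half_width=12.0, steps=20000):
    """Trapezoidal approximation of the integral of q^alpha * q~^(1-alpha)."""
    lo = min(mu, mu_tilde) - half_width
    hi = max(mu, mu_tilde) + half_width
    dx = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        x = lo + i * dx
        weight = 0.5 if i in (0, steps) else 1.0   # trapezoid end-point weights
        total += weight * (gaussian_density(x, mu) ** alpha
                           * gaussian_density(x, mu_tilde) ** (1.0 - alpha))
    return total * dx


def hellinger_integral_closed_form(alpha, mu, mu_tilde):
    """exp{-alpha (1 - alpha) (mu - mu~)^2 / 2} for unit-variance Gaussians."""
    return math.exp(-0.5 * alpha * (1.0 - alpha) * (mu - mu_tilde) ** 2)
```

For a product measure the per-coordinate factors simply multiply, which is how the sum over $k$ arises in the exponent.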


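LEMMA 1. Let $(E^n, \mathscr{E}^n)$ be measurable spaces endowed with probability measures
$Q^n$ and $\widetilde{Q}^n$, $n \ge 1$. Then the following conditions are equivalent (here $\mathfrak{z}^n = \dfrac{dQ^n}{d\mathsf{Q}^n}$,
$Z^n = \dfrac{d\widetilde{Q}^n}{dQ^n}$, and $\mathsf{Q}^n = \frac{1}{2}(Q^n + \widetilde{Q}^n)$):
a) $(\widetilde{Q}^n) \lhd (Q^n)$;
b) $\lim_{\varepsilon \downarrow 0} \varlimsup_n \widetilde{Q}^n(\mathfrak{z}^n < \varepsilon) = 0$;
c) $\lim_{N \uparrow \infty} \varlimsup_n \widetilde{Q}^n(Z^n > N) = 0$;
d) $\lim_{\alpha \downarrow 0} \varliminf_n H(\alpha; Q^n, \widetilde{Q}^n) = 1$.


Proof. This can be found in [250; Chapter V, Lemma 1.6].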


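The proof of equivalence (3) in Theorem 1 is an immediate consequence of the
equivalence of a) and c) in Lemma 1 in the case of $\widetilde{Q}^n = P_n$ and $Q^n = \widetilde{P}_n$:

$$\begin{aligned}
(P_n) \lhd (\widetilde{P}_n) &\iff \lim_{N \uparrow \infty} \varlimsup_n P\Bigl(\frac{dP_n}{d\widetilde{P}_n} > N\Bigr) = 0 \\
&\iff \lim_{\varepsilon \downarrow 0} \varlimsup_n P\Bigl(\frac{d\widetilde{P}_n}{dP_n} < \varepsilon\Bigr) = 0 \\
&\iff \lim_{\varepsilon \downarrow 0} P(Z_\infty < \varepsilon) = 0 \iff P(Z_\infty > 0) = 1,
\end{aligned}$$

where $Z_\infty$ is equal to $\lim_n \dfrac{d\widetilde{P}_n}{dP_n}$, which exists $P$-a.s.


The absence of asymptotic arbitrage under the assumption $(P_n) \lhd (\widetilde{P}_n)$ is a
consequence of the corollary to Theorem 1 in § 3b and property (3) just proved.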
3. The extension of Theorem 1 to the general (nonstationary) case proceeds without complications.
We shall stick to the pattern put forward in § 3b.3.


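THEOREM 2. Let $(\widetilde{P}^n_{k(n)})_{n \ge 1}$ be a ‘chain’ of martingale measures on $(B^n, S^n)$-markets (here $\widetilde{P}^n_{k(n)} \sim P^n_{k(n)}$). Then the condition $(P^n_{k(n)}) \lhd (\widetilde{P}^n_{k(n)})$ ensures the
absence of asymptotic arbitrage.


Proof. Since conditions a) and c) in Lemma 1 are equivalent, it follows that

$$(P^n_{k(n)}) \lhd (\widetilde{P}^n_{k(n)}) \iff \lim_{\varepsilon \downarrow 0} \varlimsup_n P^n\bigl(Z^n_{k(n)} < \varepsilon\bigr) = 0, \tag{11}$$
where $Z^n_{k(n)} = \dfrac{d\widetilde{P}^n_{k(n)}}{dP^n_{k(n)}}$ and $P^n_{k(n)} = P^n \mid \mathscr{F}^n_{k(n)}$.


The required absence of asymptotic arbitrage in the case of the contiguity
$(P^n_{k(n)}) \lhd (\widetilde{P}^n_{k(n)})$ is a consequence of (11) and Theorem 2 in § 3b.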


4. So far, we have formulated conditions of the absence of asymptotic arbitrage in
terms of the asymptotic properties of the likelihood ratios $Z^n_{k(n)}$ (Theorems 1 and 2
in § 3b) or in terms of contiguity (Theorems 1 and 2 in the present section). Lemma 1
above suggests a necessary and sufficient condition of the contiguity $(\widetilde{Q}^n) \lhd (Q^n)$
in terms of the asymptotic properties of the Hellinger integrals of order $\alpha \in (0,1)$:

$$(\widetilde{Q}^n) \lhd (Q^n) \iff \lim_{\alpha \downarrow 0} \varliminf_n H(\alpha; Q^n, \widetilde{Q}^n) = 1. \tag{12}$$

It is often easy to analyze this integral and to establish in that way the absence
of asymptotic arbitrage. (See examples in subsection 5.) The simplest particular case
here is that of the direct product of measures.

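Namely, assume that

$$E^n = E_1^n \times \cdots \times E_{k(n)}^n, \qquad \mathscr{E}^n = \mathscr{E}_1^n \otimes \cdots \otimes \mathscr{E}_{k(n)}^n,$$
$$Q^n = Q_1^n \times \cdots \times Q_{k(n)}^n, \qquad \widetilde{Q}^n = \widetilde{Q}_1^n \times \cdots \times \widetilde{Q}_{k(n)}^n,$$
where $Q_k^n$ and $\widetilde{Q}_k^n$ are probability measures on $(E_k^n, \mathscr{E}_k^n)$.

Clearly,

$$H(\alpha; Q^n, \widetilde{Q}^n) = \prod_{k=1}^{k(n)} H(\alpha; Q_k^n, \widetilde{Q}_k^n) = \prod_{k=1}^{k(n)} \bigl[1 - (1 - H(\alpha; Q_k^n, \widetilde{Q}_k^n))\bigr] \tag{13}$$
and

$$\lim_{\alpha \downarrow 0} \varliminf_n H(\alpha; Q^n, \widetilde{Q}^n) = 1 \iff \lim_{\alpha \downarrow 0} \varlimsup_n \sum_{k=1}^{k(n)} \bigl(1 - H(\alpha; Q_k^n, \widetilde{Q}_k^n)\bigr) = 0. \tag{14}$$

Thus, in the case of a direct product we have

$$(\widetilde{Q}^n) \lhd (Q^n) \iff \lim_{\alpha \downarrow 0} \varlimsup_n \sum_{k=1}^{k(n)} \bigl(1 - H(\alpha; Q_k^n, \widetilde{Q}_k^n)\bigr) = 0. \tag{15}$$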


5. We now present several examples, which on the one hand show the efficiency of 
the criteria of the absence of asymptotic arbitrage based on Hellinger integrals of 
order a and, on the other hand, can clarify the arguments and conclusions of APT, 
the theory described in Chapter I, §2d. Examples 1 and 2 are particular cases of 
examples in [260]. 


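EXAMPLE 1 (‘Large’ stationary market with $d(n) = 1$ and $k(n) = n$). We consider
the probability space $(\Omega, \mathscr{F}, P)$, where $\Omega = \{-1, 1\}^\infty$ is the set of binary sequences
$x = (x_1, x_2, \ldots)$ ($x_i = \pm 1$) and $P$ is a measure such that $P\{x : (x_1, \ldots, x_n)\} = 2^{-n}$.
Let $\varepsilon_i(x) = x_i$, $i = 1, 2, \ldots$. Then $\varepsilon = (\varepsilon_1, \varepsilon_2, \ldots)$ is a sequence of Bernoulli
random variables with $P(\varepsilon_i = \pm 1) = \frac{1}{2}$.

We assume that each $(B^n, S^n)$-market defined on $(\Omega, \mathscr{F}^n, P^n)$ with $\mathscr{F}^n = \mathscr{F}_n = \sigma(\varepsilon_1, \ldots, \varepsilon_n)$ and $P^n = P \mid \mathscr{F}^n$ has the properties $B_k^n \equiv 1$ and $S^n = (S_1, \ldots, S_n)$
with

$$S_k = S_{k-1}(1 + \rho_k) \quad\text{and}\quad S_0 = 1, \tag{16}$$

where $\rho_k = \mu_k + \sigma_k \varepsilon_k$, $\sigma_k > 0$, and $\max(-\sigma_k, \sigma_k - 1) < \mu_k < \sigma_k$ (cf. condition (2)
in Chapter V, § 1d).
We can rewrite (16) as follows:

$$S_k = S_{k-1}\bigl(1 + \sigma_k(\varepsilon_k - b_k)\bigr), \tag{17}$$

where $b_k = -(\mu_k / \sigma_k)$. (Note that $|b_k| < 1$.)

It follows from (17) and Theorem 2 in Chapter V, § 3f that there exists a unique
martingale measure, which is a direct product $\widetilde{P}^n = \widetilde{P}_1^n \times \widetilde{P}_2^n \times \cdots \times \widetilde{P}_n^n$ and has
the following properties: the variables $\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_n$ are independent with respect
to this measure and

$$\widetilde{P}^n(\varepsilon_k = 1) = \tfrac{1}{2}(1 + b_k), \qquad \widetilde{P}^n(\varepsilon_k = -1) = \tfrac{1}{2}(1 - b_k).$$

Since

$$H(\alpha; \widetilde{P}^n, P^n) = \prod_{k=1}^{n} \Bigl[ \frac{(1 + b_k)^{\alpha} + (1 - b_k)^{\alpha}}{2} \Bigr],$$
it follows by (12) and (15) that

$$\begin{aligned}
(P^n) \lhd (\widetilde{P}^n) &\iff \lim_{\alpha \downarrow 0} \varliminf_n \prod_{k=1}^{n} \Bigl[ \frac{(1 + b_k)^{\alpha} + (1 - b_k)^{\alpha}}{2} \Bigr] = 1 \\
&\iff \lim_{\alpha \downarrow 0} \varlimsup_n \sum_{k=1}^{n} \Bigl[ 1 - \frac{(1 + b_k)^{\alpha} + (1 - b_k)^{\alpha}}{2} \Bigr] = 0.
\end{aligned}$$

Hence it is easy to conclude that

$$(P^n) \lhd (\widetilde{P}^n) \iff \sum_{k=1}^{\infty} b_k^2 < \infty.$$

Recalling that $b_k = -\dfrac{\mu_k}{\sigma_k}$ and using Theorem 1 we see that the condition
$\sum_{k=1}^{\infty} \Bigl(\dfrac{\mu_k}{\sigma_k}\Bigr)^2 < \infty$ is necessary and sufficient for the absence of asymptotic arbitrage
on our ‘large’ stationary market.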
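The role of the series $\sum b_k^2$ in Example 1 can be illustrated numerically. The sketch below evaluates the product $\prod_{k \le n} \frac{1}{2}\bigl[(1 + b_k)^{\alpha} + (1 - b_k)^{\alpha}\bigr]$ for two hypothetical choices of $(b_k)$ that are not taken from the text: $b_k = 0.5/k$ (square-summable) and $b_k = 0.9/\sqrt{k}$ (not square-summable). It is an illustration of the criterion, not a proof.

```python
def hellinger_integral_bernoulli(alpha, bs):
    """Product over k of ((1 + b_k)^alpha + (1 - b_k)^alpha) / 2, with |b_k| < 1."""
    h = 1.0
    for b in bs:
        h *= ((1.0 + b) ** alpha + (1.0 - b) ** alpha) / 2.0
    return h


n = 100_000
summable = [0.5 / k for k in range(1, n + 1)]               # sum of b_k^2 is finite
non_summable = [0.9 / k ** 0.5 for k in range(1, n + 1)]    # sum of b_k^2 diverges

h_summable_small_alpha = hellinger_integral_bernoulli(0.01, summable)
h_non_summable = hellinger_integral_bernoulli(0.5, non_summable)
h_non_summable_shorter = hellinger_integral_bernoulli(0.5, non_summable[:1000])
```

In the square-summable case the product stays close to 1 for small $\alpha$, uniformly in $n$; in the other case the product keeps decreasing as $n$ grows for each fixed $\alpha$, so the limit in (12) cannot equal 1.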

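EXAMPLE 2 (‘Large’ market with $k(n) = 1$ and $d(n) = n$). Consider a single-step
model of $(B^n, S^n)$-markets, where $B^n = (B_k^n)$, $S^n = (S_k^0, S_k^1, \ldots, S_k^{n-1})$ with $k = 0$
or $1$, $B_0^n = B_1^n = 1$, and

$$S_1^i = S_0^i (1 + \rho^i), \qquad S_0^i > 0, \tag{19}$$

where
$$\rho^0 = \mu_0 + \sigma_0 \varepsilon_0, \tag{20}$$
$$\rho^i = \mu_i + \sigma_i (c_i \varepsilon_0 + \bar{c}_i \varepsilon_i), \qquad i \ge 1. \tag{21}$$

We also assume that $\sigma_i > 0$, $\bar{c}_i > 0$, $c_i^2 + \bar{c}_i^2 = 1$, and $\varepsilon = (\varepsilon_0, \varepsilon_1, \ldots)$ is
a sequence of independent Bernoulli random variables taking the values $\pm 1$ with
probabilities $\frac{1}{2}$.

With an eye on the theories CAPM and APT, we recommend to interpret $S_k^i$,
$i \ge 1$, as the price at time $k$ of some stock that is traded on a ‘large’, ‘global’ market
and $S_k^0$ as some general index of this market (for example, the S&P500 Index of
the market of the 500 stocks covered by it; see Chapter I, § 1b.6).

Let $\beta_i = \dfrac{\sigma_i c_i}{\sigma_0}$, $i \ge 1$, and let

$$b_0 = -\frac{\mu_0}{\sigma_0}, \qquad b_i = \frac{\mu_0 \beta_i - \mu_i}{\sigma_i \bar{c}_i}, \tag{22}$$

where $|b_0| \ne 1$ and $|b_i| \ne 1$, $i \ge 1$.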
Using this notation, we obtain by (19)–(21) that
$$S_1^0 = S_0^0 \bigl(1 + \sigma_0(\varepsilon_0 - b_0)\bigr), \tag{23}$$
and
$$S_1^i = S_0^i \bigl(1 + \sigma_i c_i (\varepsilon_0 - b_0) + \sigma_i \bar{c}_i (\varepsilon_i - b_i)\bigr) \tag{24}$$
for $i \ge 1$.

Sticking to the above-described scheme of series of $(B^n, S^n)$-markets we can
assume that each of the markets is defined on a probability space $(\Omega, \mathscr{F}^n, P^n)$,
where $\mathscr{F}^n = \sigma(\varepsilon_0, \varepsilon_1, \ldots, \varepsilon_{n-1})$, $P^n = P \mid \mathscr{F}^n$, and $\Omega$ and $P$ are the same as in the
above example.

It is easy to see from (23) and (24) that, in the framework of this scheme, for
each $n \ge 1$ there exists (at least one) martingale measure. For we can consider a
measure $\widetilde{P}^n$ (having again the structure of a direct product) such that the variables
$\varepsilon_0, \varepsilon_1, \ldots, \varepsilon_{n-1}$ are independent with respect to $\widetilde{P}^n$, namely,

$$\widetilde{P}^n(\varepsilon_i = 1) = \tfrac{1}{2}(1 + b_i) \quad\text{and}\quad \widetilde{P}^n(\varepsilon_i = -1) = \tfrac{1}{2}(1 - b_i).$$

It is straightforward that

$$H(\alpha; \widetilde{P}^n, P^n) = \prod_{i=0}^{n-1} \Bigl[ \frac{(1 + b_i)^{\alpha} + (1 - b_i)^{\alpha}}{2} \Bigr]. \tag{25}$$
As in the previous example, we conclude that
$$(P^n) \lhd (\widetilde{P}^n) \iff \sum_{i=0}^{\infty} b_i^2 < \infty,$$
so that, by Theorem 2, the condition
$$\sum_{i=1}^{\infty} \Bigl( \frac{\mu_0 \beta_i - \mu_i}{\sigma_i \bar{c}_i} \Bigr)^2 < \infty \tag{26}$$
(in addition to $\Bigl|\dfrac{\mu_0}{\sigma_0}\Bigr| < 1$ and $\Bigl|\dfrac{\mu_0 \beta_i - \mu_i}{\sigma_i \bar{c}_i}\Bigr| < 1$, $i \ge 1$) ensures the absence of
asymptotic arbitrage. (Compare formula (26) with (4) in Chapter I, § 2c and (19)
in Chapter I, § 2d.)


Remark 2. We point out one crucial distinction between the above examples. In
the first of them, where $k \le n$, there exists a unique martingale measure $\widetilde{P}^n$, which
allows us to claim (on the basis of Theorem 1) that the condition $\sum_{k=1}^{\infty} \Bigl(\dfrac{\mu_k}{\sigma_k}\Bigr)^2 < \infty$
is necessary and sufficient for the absence of asymptotic arbitrage.
On the other hand, in the second example, where $n$ is the index of a series, the
measure $\widetilde{P}^n$ is not unique for $n > 1$. This explains why (26) is only a sufficient
condition of the absence of asymptotic arbitrage. (As shown in [260], the condition
that is both necessary and sufficient has the form $\varliminf_i \min(1 + b_i, 1 - b_i) > 0$.)


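EXAMPLE 3. We consider a stationary logarithmically Gaussian (Chapter V, § 3c)
market $(B, S) = \{(B^n, S^n),\ n \ge 1\}$ with $B_k^n \equiv 1$ and $S^n = (S_0, S_1, \ldots, S_n)$, where

$$S_k = S_0\, e^{h_1 + \cdots + h_k}, \qquad S_0 > 0. \tag{27}$$

We assume that $h_k = \mu_k + \sigma_k \varepsilon_k$, $k \ge 1$, where $(\varepsilon_1, \varepsilon_2, \ldots)$ is a sequence of independent, normally distributed ($\mathscr{N}(0, 1)$) random variables defined on a probability
space $(\Omega, \mathscr{F}, P)$ and $\sigma_k > 0$, $k \ge 1$.

Let $\mathscr{F}_n = \sigma(\varepsilon_1, \ldots, \varepsilon_n)$ and let $P_n = P \mid \mathscr{F}_n$, $n \ge 1$. It was shown in Chapter V, § 3c that if

$$Z_n = \exp\Bigl\{ -\sum_{k=1}^{n} \Bigl[ \Bigl(\frac{\mu_k}{\sigma_k} + \frac{\sigma_k}{2}\Bigr) \varepsilon_k + \frac{1}{2} \Bigl(\frac{\mu_k}{\sigma_k} + \frac{\sigma_k}{2}\Bigr)^2 \Bigr] \Bigr\}, \tag{28}$$

then the sequence of prices $(S_k)_{k \le n}$ is a martingale with respect to the measure $\widetilde{P}_n$
such that $d\widetilde{P}_n = Z_n\, dP_n$; moreover, $\mathrm{Law}(h_k \mid \widetilde{P}_n) = \mathscr{N}(\widetilde{\mu}_k, \sigma_k^2)$, where

$$\widetilde{\mu}_k = -\frac{\sigma_k^2}{2}.$$

It is now easy to see from formula (10) that

$$H(\alpha; \widetilde{P}_n, P_n) = \exp\Bigl\{ -\frac{\alpha(1-\alpha)}{2} \sum_{k=1}^{n} \Bigl(\frac{\mu_k}{\sigma_k} + \frac{\sigma_k}{2}\Bigr)^2 \Bigr\}. \tag{29}$$

By (12) we obtain

$$(P_n) \lhd (\widetilde{P}_n) \iff \sum_{k=1}^{\infty} \Bigl(\frac{\mu_k}{\sigma_k} + \frac{\sigma_k}{2}\Bigr)^2 < \infty,$$
so that the condition

$$\sum_{k=1}^{\infty} \Bigl(\frac{\mu_k}{\sigma_k} + \frac{\sigma_k}{2}\Bigr)^2 < \infty \tag{30}$$

ensures the absence of asymptotic arbitrage.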
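The density used in Example 3 can be sanity-checked one step at a time. The sketch below assumes a single factor of the form $\exp\{-a\varepsilon - a^2/2\}$ with $a = \mu/\sigma + \sigma/2$ and $\varepsilon \sim \mathscr{N}(0,1)$; it verifies by trapezoidal quadrature that this factor has mean 1 and that $E\bigl[\exp\{-a\varepsilon - a^2/2\}\, e^{\mu + \sigma\varepsilon}\bigr] = 1$, which is the one-step martingale property of the price under the new measure. The helper names and the numerical parameters are ours and serve only as an illustration.

```python
import math


def normal_pdf(e):
    """Standard normal density."""
    return math.exp(-0.5 * e * e) / math.sqrt(2.0 * math.pi)


def integrate(f, lo=-15.0, hi=15.0, steps=30000):
    """Plain trapezoidal rule on [lo, hi]."""
    dx = (hi - lo) / steps
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, steps):
        total += f(lo + i * dx)
    return total * dx


def one_step_check(mu, sigma):
    """Return (E[z], E[z * exp(h)]) for z = exp(-a*eps - a^2/2), h = mu + sigma*eps."""
    a = mu / sigma + sigma / 2.0

    def z(e):
        return math.exp(-a * e - 0.5 * a * a)

    total_mass = integrate(lambda e: z(e) * normal_pdf(e))
    mean_growth = integrate(lambda e: z(e) * math.exp(mu + sigma * e) * normal_pdf(e))
    return total_mass, mean_growth


total_mass, mean_growth = one_step_check(mu=0.05, sigma=0.2)
```

The quantity $a^2 = (\mu/\sigma + \sigma/2)^2$ is exactly the $k$-th term of the series whose convergence is required in the example.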
Remark 3. If $\dfrac{\mu_k}{\sigma_k} + \dfrac{\sigma_k}{2} = 0$, i.e.,

$$\mu_k = -\frac{\sigma_k^2}{2}, \tag{31}$$

then the initial probability measure $P$ is a martingale measure for $S = (S_k)_{k \ge 0}$.

Note that condition (30) is also necessary and sufficient for the relation
$(\widetilde{P}_n) \lhd (P_n)$. Hence this condition is necessary and sufficient in order that the
sequences of measures $(P_n)$ and $(\widetilde{P}_n)$ be mutually contiguous; we denote this
property by $(P_n) \lhd\rhd (\widetilde{P}_n)$.


6. We now discuss briefly the concept of complete asymptotic separability, a natural 
‘asymptotic’ counterpart of the concept of singularity. 


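DEFINITION 3. Let $Q^n$ and $\widetilde{Q}^n$, $n \ge 1$, be sequences of probability measures on
measurable spaces $(E^n, \mathscr{E}^n)$. Then we say that $(Q^n)_{n \ge 1}$ and $(\widetilde{Q}^n)_{n \ge 1}$ have the
property of complete asymptotic separability (and we write $(\widetilde{Q}^n) \,\triangle\, (Q^n)$) if there
exists a subsequence $(n_k)$, $n_k \uparrow \infty$ as $k \uparrow \infty$, such that for each $k$ there exists a
subset $A^{n_k} \in \mathscr{E}^{n_k}$ such that $Q^{n_k}(A^{n_k}) \to 1$ and $\widetilde{Q}^{n_k}(A^{n_k}) \to 0$ as $k \uparrow \infty$.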


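LEMMA 2. Let $(E^n, \mathscr{E}^n)$ be measurable spaces endowed with probability measures
$Q^n$ and $\widetilde{Q}^n$, $n \ge 1$. Then the following properties are equivalent (here $\mathfrak{z}^n = \dfrac{dQ^n}{d\mathsf{Q}^n}$,
$Z^n = \dfrac{d\widetilde{Q}^n}{dQ^n}$, and $\mathsf{Q}^n = \frac{1}{2}(Q^n + \widetilde{Q}^n)$):
a) $(\widetilde{Q}^n) \,\triangle\, (Q^n)$;
b) $\varliminf_n \widetilde{Q}^n(\mathfrak{z}^n \ge \varepsilon) = 0$ for all $\varepsilon > 0$;
c) $\varliminf_n \widetilde{Q}^n(Z^n \le N) = 0$ for all $N > 0$;
d) $\lim_{\alpha \downarrow 0} \varliminf_n H(\alpha; Q^n, \widetilde{Q}^n) = 0$;
e) $\varliminf_n H(\alpha; Q^n, \widetilde{Q}^n) = 0$ for all $\alpha \in (0, 1)$;
f) $\varliminf_n H(\alpha; Q^n, \widetilde{Q}^n) = 0$ for some $\alpha \in (0, 1)$.

Proof. This can be found in [250; Chapter V, Lemma 1.9].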


7. Our analysis in Examples 1–3 of the cases when there is no asymptotic arbitrage demonstrates the efficiency of criteria formulated in terms of the asymptotic
properties of Hellinger integrals of order $\alpha > 0$.

For filtered probability spaces (as in Examples 1 and 3) it can be useful to consider also the so-called Hellinger process: we can also formulate criteria of absolute
continuity, contiguity, and other properties of probability measures with respect to
one another in terms of such a process.

We now present what can be regarded as an introduction into the range of
issues related to Hellinger processes by means of a discrete-time example. (See [250;
Chapters IV and V] for greater detail.)

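Let $P$ and $\widetilde{P}$ be probability measures on a filtered measurable space
$(\Omega, \mathscr{F}, (\mathscr{F}_n)_{n \ge 0})$, where $\mathscr{F}_0 = \{\varnothing, \Omega\}$ and $\mathscr{F} = \bigvee \mathscr{F}_n$.

Let $P_n = P \mid \mathscr{F}_n$ and $\widetilde{P}_n = \widetilde{P} \mid \mathscr{F}_n$ be their restrictions to $\mathscr{F}_n$, $n \ge 1$, let
$\mathsf{Q} = \frac{1}{2}(P + \widetilde{P})$, and let $\mathsf{Q}_n = \mathsf{Q} \mid \mathscr{F}_n$.

We set $\mathfrak{z}_n = \dfrac{dP_n}{d\mathsf{Q}_n}$, $\widetilde{\mathfrak{z}}_n = \dfrac{d\widetilde{P}_n}{d\mathsf{Q}_n}$, $\beta_n = \dfrac{\mathfrak{z}_n}{\mathfrak{z}_{n-1}}$, and $\widetilde{\beta}_n = \dfrac{\widetilde{\mathfrak{z}}_n}{\widetilde{\mathfrak{z}}_{n-1}}$ (here we set $0/0 = 0$;
recall that $\mathfrak{z}_n = 0$ if $\mathfrak{z}_{n-1} = 0$ and $\widetilde{\mathfrak{z}}_n = 0$ if $\widetilde{\mathfrak{z}}_{n-1} = 0$).

Using this notation we can express the Hellinger integral $H_n(\alpha) = H(\alpha; P_n, \widetilde{P}_n)$
of order $\alpha$ as follows:

$$H_n(\alpha) = E_{\mathsf{Q}}\, \mathfrak{z}_n^{\alpha}\, \widetilde{\mathfrak{z}}_n^{1-\alpha}. \tag{32}$$

We consider now the process $Y(\alpha) = (Y_n(\alpha))_{n \ge 0}$ of the variables
$$Y_n(\alpha) = \mathfrak{z}_n^{\alpha}\, \widetilde{\mathfrak{z}}_n^{1-\alpha}. \tag{33}$$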

Let $f_\alpha(u, v) = u^{\alpha} v^{1-\alpha}$. This function is concave (for $u \ge 0$ and $v \ge 0$),
and therefore we have

$$E_{\mathsf{Q}}(Y_n(\alpha) \mid \mathscr{F}_m) \le Y_m(\alpha) \tag{34}$$

($\mathsf{Q}$-a.s.) for $m \le n$ by Jensen's inequality.

Hence the sequence $Y(\alpha) = (Y_n(\alpha), \mathscr{F}_n, \mathsf{Q})$ is a (bounded) supermartingale,
which, in view of the Doob decomposition (Chapter II, § 1b), can be represented as
follows:

$$Y_n(\alpha) = M_n(\alpha) - A_n(\alpha), \tag{35}$$
where $M(\alpha) = (M_n(\alpha), \mathscr{F}_n, \mathsf{Q})$ is a martingale and $A(\alpha) = (A_n(\alpha), \mathscr{F}_{n-1}, \mathsf{Q})$ is a
nondecreasing predictable process with $A_0(\alpha) = 0$.

The specific structure of $Y(\alpha)$ (see (33)) enables us to give the following representation of the predictable process $A(\alpha)$:

$$A_n(\alpha) = \sum_{k=1}^{n} Y_{k-1}(\alpha)\, \Delta h_k(\alpha), \tag{36}$$

where $h(\alpha) = (h_k(\alpha))_{k \ge 0}$, $h_0(\alpha) = 0$, is some nondecreasing predictable process.
In general, there is no canonical way to define this process. For instance, both
processes

$$h_n(\alpha) = \sum_{k=1}^{n} E_{\mathsf{Q}}\bigl(1 - \beta_k^{\alpha}\, \widetilde{\beta}_k^{1-\alpha} \mid \mathscr{F}_{k-1}\bigr) \tag{37}$$
and

$$h_n(\alpha) = \sum_{k=1}^{n} E_{\mathsf{Q}}\bigl(\varphi_\alpha(\beta_k, \widetilde{\beta}_k) \mid \mathscr{F}_{k-1}\bigr), \tag{38}$$
where $\varphi_\alpha(u, v) = \alpha u + (1 - \alpha) v - u^{\alpha} v^{1-\alpha}$, $0 < \alpha < 1$, satisfy the above requirements.
This can be proved by a direct verification of the fact that the process $M(\alpha) = (M_n(\alpha), \mathscr{F}_n, \mathsf{Q})$ with

$$M_n(\alpha) = Y_n(\alpha) + \sum_{k=1}^{n} Y_{k-1}(\alpha)\, \Delta h_k(\alpha) \tag{39}$$

is a martingale for both of them. (See also [250; Chapter IV, § 1e].)


DEFINITION 4. By a Hellinger process of order $\alpha \in (0,1)$ we mean an arbitrary
predictable process $h(\alpha) = (h_k(\alpha))_{k \ge 0}$, $h_0(\alpha) = 0$, such that the process $M(\alpha) = (M_k(\alpha), \mathscr{F}_k, \mathsf{Q})_{k \ge 0}$ defined by (39) is a martingale.


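Remark 4. Assume that $\widetilde{P} \overset{\mathrm{loc}}{\ll} P$, i.e., $\widetilde{P}_n \ll P_n$ for $n \ge 0$. Let $Z_n = \dfrac{d\widetilde{P}_n}{dP_n}$ and let
$\rho_n = \dfrac{Z_n}{Z_{n-1}}$. Then the processes $h(\alpha) = (h_n(\alpha))$ defined by (37) and (38), respectively,
have the following representations:

$$h_n(\alpha) = \sum_{k=1}^{n} E_P\bigl(1 - \rho_k^{1-\alpha} \mid \mathscr{F}_{k-1}\bigr) \tag{40}$$
and
$$h_n(\alpha) = \sum_{k=1}^{n} E_P\bigl(\varphi_\alpha(1, \rho_k) \mid \mathscr{F}_{k-1}\bigr). \tag{41}$$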


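Remark 5. We consider now the ‘direct product scheme’ by setting $\Omega = E_1 \times E_2 \times \cdots$,
$\mathcal F = \mathcal E_1 \otimes \mathcal E_2 \otimes \cdots$, $P = Q_1 \times Q_2 \times \cdots$, and $\tilde P = \tilde Q_1 \times \tilde Q_2 \times \cdots$, where $Q_i$ and $\tilde Q_i$
are probability measures on $(E_i, \mathcal E_i)$.

In this case we set $\mathcal F_n = \mathcal E_1 \otimes \cdots \otimes \mathcal E_n$, $P_n = Q_1 \times \cdots \times Q_n$, and $\tilde P_n = \tilde Q_1 \times \cdots \times \tilde Q_n$.

Then the property $\tilde P \overset{\mathrm{loc}}{\ll} P$ is equivalent to the relation $\tilde Q_n \ll Q_n$, $n \ge 1$, and we set

$$\rho_n = \frac{d\tilde Q_n}{dQ_n}.$$

Since $E_P\rho_n = 1$, the right-hand sides of (40) and (41) are the same. The
corresponding Hellinger process $h(\alpha) = (h_n(\alpha))$ with

$$h_n(\alpha) = \sum_{k=1}^{n} E_P\bigl(1 - \rho_k^{1-\alpha}\bigr), \qquad n \ge 1,$$

is deterministic and

$$h_n(\alpha) = \sum_{k=1}^{n} \bigl(1 - H(\alpha; Q_k, \tilde Q_k)\bigr). \tag{42}$$

If $H_n(\alpha) = H(\alpha; P_n, \tilde P_n)$, then in our ‘direct product case’ we have

$$H_n(\alpha) = H_{n-1}(\alpha)\,H(\alpha; Q_n, \tilde Q_n) = H_{n-1}(\alpha)\bigl[1 - \bigl(1 - H(\alpha; Q_n, \tilde Q_n)\bigr)\bigr].$$

Using the notation (42) we obtain

$$\Delta H_n(\alpha) = -H_{n-1}(\alpha)\,\Delta h_n(\alpha).$$

We already encountered difference equations of this kind in Chapter II (see
formula (11) in § 1a). We expressed their solution in terms of stochastic exponentials:

$$H_n(\alpha) = H_0(\alpha)\,\mathcal E(-h(\alpha))_n,$$

where

$$\mathcal E(-h(\alpha))_n = e^{-h_n(\alpha)} \prod_{k=1}^{n} \bigl(1 - \Delta h_k(\alpha)\bigr)e^{\Delta h_k(\alpha)} = \prod_{k=1}^{n} \bigl(1 - \Delta h_k(\alpha)\bigr) \quad \Bigl(= \prod_{k=1}^{n} H(\alpha; Q_k, \tilde Q_k)\Bigr).$$

This agrees completely with the expansion

$$H_n(\alpha) = H_0(\alpha) \prod_{k=1}^{n} H(\alpha; Q_k, \tilde Q_k)$$

holding in the present case.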
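The identity between the stochastic exponential of $-h(\alpha)$ and the product of Hellinger integrals can be checked numerically. The following is only a minimal sketch under stated assumptions: the coordinate laws are hypothetical two-point distributions chosen for illustration, $H_0(\alpha)$ is taken equal to 1, and the function names are not taken from the text.

```python
import math

def hellinger_integral(alpha, q, q_tilde):
    """H(alpha; Q, Q~) = sum_x q(x)^alpha * q~(x)^(1 - alpha) for discrete laws."""
    return sum((u ** alpha) * (v ** (1.0 - alpha)) for u, v in zip(q, q_tilde))

def hellinger_increments(alpha, factors):
    """Increments Delta h_k(alpha) = 1 - H(alpha; Q_k, Q~_k), as in (42)."""
    return [1.0 - hellinger_integral(alpha, q, qt) for q, qt in factors]

def stochastic_exponential_of_minus_h(increments):
    """E(-h)_n = prod_k (1 - Delta h_k) for a discrete-time process h."""
    value = 1.0
    for dh in increments:
        value *= 1.0 - dh
    return value

# Hypothetical coordinate laws (Q_k, Q~_k), k = 1, 2, 3 (illustration only).
factors = [
    ([0.5, 0.5], [0.6, 0.4]),
    ([0.3, 0.7], [0.25, 0.75]),
    ([0.9, 0.1], [0.8, 0.2]),
]
alpha = 0.5
increments = hellinger_increments(alpha, factors)
via_exponential = stochastic_exponential_of_minus_h(increments)
via_product = math.prod(hellinger_integral(alpha, q, qt) for q, qt in factors)
print(via_exponential, via_product)
```

Both printed numbers should coincide up to rounding, which is exactly the agreement between the two expressions for $H_n(\alpha)$.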

The next results (relating to the ‘scheme of series’) reveal the role of the stochastic
exponential in the issues of the contiguity and complete asymptotic separability
of sequences of probability measures $(P^n)_{n\ge1}$ and $(\tilde P^n)_{n\ge1}$ on filtered measurable
spaces $(\Omega^n, \mathcal F^n, (\mathcal F^n_k)_{k\le k(n)})$, $n \ge 1$, with $\mathcal F^n = \mathcal F^n_{k(n)}$ and $\mathcal F^n_0 = \{\varnothing, \Omega^n\}$.

By analogy with (39) (but modifying our notation in an obvious way so as to
fall in the scheme of series) we shall denote by

$$h^n_{k(n)}(\alpha) = \sum_{k=1}^{k(n)} E_{Q^n}\bigl(1 - (\beta^n_k)^{\alpha}(\tilde\beta^n_k)^{1-\alpha} \,\big|\, \mathcal F^n_{k-1}\bigr) \tag{42'}$$

a Hellinger process of order $\alpha \in (0, 1)$ corresponding to the measures $P^n$ and $\tilde P^n$.
LEMMA 3. The following conditions are equivalent:

a) $(\tilde P^n) \lhd (P^n)$;

b) $(\tilde P^n_0) \lhd (P^n_0)$ and

$$\lim_{\alpha\downarrow0}\,\limsup_{n}\, \tilde P^n\bigl(h^n_{k(n)}(\alpha) > \varepsilon\bigr) = 0 \tag{43}$$

for each $\varepsilon > 0$.


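COROLLARY. Consider a stationary case, when $P$ and $\tilde P$ are probability measures on
a filtered measurable space $(\Omega, \mathcal F, (\mathcal F_k)_{k\ge0})$. Let $P_N = P\,|\,\mathcal F_N$ and $\tilde P_N = \tilde P\,|\,\mathcal F_N$.
Then, for each $N \ge 1$ we have the absolute continuity $\tilde P_N \ll P_N$ if and only if

$$\tilde P_0 \ll P_0 \quad\text{and}\quad h_N(\alpha) \xrightarrow{\ \tilde P\ } 0 \ \text{ as } \alpha \downarrow 0. \tag{44}$$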


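LEMMA 4. If

$$\lim_{n} \tilde P^n\bigl(h^n_{k(n)}(\tfrac12) \ge N\bigr) = 1 \tag{45}$$

for each $N > 0$, then $(\tilde P^n) \bigtriangleup (P^n)$.

COROLLARY. In the stationary case either of the conditions $\tilde P_0 \perp P_0$ and

$$\tilde P\bigl(h_N(\tfrac12) = \infty\bigr) = 1 \tag{46}$$

is sufficient for the relation $\tilde P_N \perp P_N$.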


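Criteria for the properties $\tilde P \ll P$ and $\tilde P \perp P$ take a particularly simple form if
we assume additionally that $\tilde P \overset{\mathrm{loc}}{\ll} P$ (i.e., $\tilde P_n \ll P_n$, $n \ge 0$). Namely,

$$\tilde P \ll P \iff \tilde P\Bigl\{\sum_{n=1}^{\infty} E_P\bigl[(1 - \sqrt{\alpha_n})^2 \,\big|\, \mathcal F_{n-1}\bigr] < \infty\Bigr\} = 1,$$

$$\tilde P \perp P \iff \tilde P\Bigl\{\sum_{n=1}^{\infty} E_P\bigl[(1 - \sqrt{\alpha_n})^2 \,\big|\, \mathcal F_{n-1}\bigr] = \infty\Bigr\} = 1,$$

where $\alpha_n = \dfrac{Z_n}{Z_{n-1}}$.

The proofs of Lemmas 3 and 4 and the corollaries to them can be found in
[250; Chapter V, § 2c and Chapter IV, § 2c].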
8. We proceed now to the proof of the necessity of conditions (9) and (15) in
Theorems 1 and 2 of § 3b.

To this end it is convenient to use the following generalization of the
concept of contiguity introduced in [260] in connection with asymptotic arbitrage
on incomplete markets.

For each $n \ge 1$ let $(E^n, \mathcal E^n)$ be a measurable space with probability measure $Q^n$
and let $\mathbf Q^n = \{\tilde Q^n\}$ be a family of probability measures $\tilde Q^n$ in this space. (In what
follows $\mathbf Q^n$ will always be a family $\mathcal P(P^n)$ of martingale measures.)

We associate with a family $\mathbf Q^n = \{\tilde Q^n\}$ of measures $\tilde Q^n$ their upper envelope
$\sup \mathbf Q^n$, the function of sets $A \in \mathcal E^n$ such that

$$(\sup \mathbf Q^n)(A) = \sup_{\tilde Q^n \in \mathbf Q^n} \tilde Q^n(A). \tag{47}$$

We shall also denote by $\operatorname{conv} \mathbf Q^n$ the convex hull of $\mathbf Q^n$.

DEFINITION 5. We say that a sequence of measures $(Q^n)_{n\ge1}$ is contiguous to the
sequence of upper envelopes $(\sup \mathbf Q^n)_{n\ge1}$ (we write $(Q^n) \lhd (\sup \mathbf Q^n)$) if for each
sequence of sets $A^n \in \mathcal E^n$, $n \ge 1$, such that $(\sup \mathbf Q^n)(A^n) \to 0$ we also have
$Q^n(A^n) \to 0$.

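Let

$$Z^n(Q) = \frac{dQ^n}{dQ}, \qquad \mathfrak z^n(Q) = \frac{dQ}{d\bar Q^n}$$

for $Q \in \operatorname{conv} \mathbf Q^n$, where $\bar Q^n = \tfrac12(Q + Q^n)$.
The following result of [260] is a straightforward generalization of Lemma 1.

LEMMA 5. The following conditions are equivalent:
a) $(Q^n) \lhd (\sup \mathbf Q^n)$;
b) $\displaystyle \lim_{\varepsilon\downarrow0}\,\limsup_{n}\, \inf_{Q \in \operatorname{conv}\mathbf Q^n} Q^n\bigl(\mathfrak z^n(Q) < \varepsilon\bigr) = 0$;
c) $\displaystyle \lim_{N\uparrow\infty}\,\limsup_{n}\, \inf_{Q \in \operatorname{conv}\mathbf Q^n} Q^n\bigl(Z^n(Q) > N\bigr) = 0$;
d) $\displaystyle \lim_{\alpha\downarrow0}\,\liminf_{n}\, \sup_{Q \in \operatorname{conv}\mathbf Q^n} H(\alpha; Q, Q^n) = 1$.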
9. Proceeding to the direct proof of the necessity of (9) and (15) in Theorems 1
and 2 in § 3b we shall set $Q^n = P^n$ $(= P^n_{k(n)})$ and, as already mentioned, we take
the family $\mathcal P(P^n)$ of all martingale measures $\tilde P^n_{k(n)}$ as $\mathbf Q^n$, $n \ge 1$.

Clearly, $\mathcal P(P^n)$ is also equal to $\operatorname{conv} \mathbf Q^n$ in this case, therefore condition c) in
Lemma 5 takes the following form:

$$\lim_{N\uparrow\infty}\,\limsup_{n}\, \inf_{\tilde P^n_{k(n)} \in \mathcal P(P^n_{k(n)})} P^n_{k(n)}\Bigl(\frac{dP^n_{k(n)}}{d\tilde P^n_{k(n)}} > N\Bigr) = 0,$$

which is equivalent to the following condition (since $\tilde P^n_{k(n)} \sim P^n_{k(n)}$):

$$\lim_{\varepsilon\downarrow0}\,\limsup_{n}\, \inf_{Z^n_{k(n)} \in \mathbf Z^n_{k(n)}} P^n_{k(n)}\bigl(Z^n_{k(n)} < \varepsilon\bigr) = 0, \tag{48}$$

where

$$\mathbf Z^n_{k(n)} = \Bigl\{Z^n_{k(n)}\colon\ Z^n_{k(n)} = \frac{d\tilde P^n_{k(n)}}{dP^n_{k(n)}},\ \tilde P^n_{k(n)} \in \mathcal P(P^n_{k(n)})\Bigr\}.$$

Since (48) is precisely the same as (15) in § 3b, it follows by the equivalence of
conditions a) and c) in Lemma 5 that the ‘sufficiency’ part of Theorem 2 in § 3b
can be reformulated as follows: the condition

$$(P^n_{k(n)}) \lhd (\sup \tilde P^n_{k(n)}) \tag{49}$$

means the absence of asymptotic arbitrage.

Hence (again, by the equivalence of a) and c) in Lemma 5) to prove the necessity
in this theorem we must show that the absence of asymptotic arbitrage means (49).

We shall carry out a proof by contradiction. Assume that (49) fails.

Picking a subsequence if necessary we can assume that the sets $A^n \in \mathcal F^n_{k(n)}$
satisfy the relation

$$(\sup \tilde P^n_{k(n)})(A^n) \to 0, \tag{50}$$

but $P^n_{k(n)}(A^n) \to \alpha > 0$.

We claim that we have asymptotic arbitrage in this case.
For a proof we consider the process

$$X^n_k = \operatorname*{ess\,sup}_{\tilde P^n_{k(n)} \in \mathcal P(P^n_{k(n)})} E_{\tilde P^n_{k(n)}}\bigl(I_{A^n} \,\big|\, \mathcal F^n_k\bigr), \qquad k \le k(n). \tag{51}$$

By the theorem in § 2b, $X^n = (X^n_k)$ is a supermartingale with respect to each
measure $\tilde P^n_{k(n)} \in \mathcal P(P^n_{k(n)})$ and by the theorem in § 2d it has an optional
decomposition

$$X^n_k = X^n_0 + \sum_{j=1}^{k} (\gamma^n_j, \Delta S^n_j) - C^n_k, \tag{52}$$

where $C^n_0 = 0$, the $C^n_k$ are $\mathcal F^n_k$-measurable, and the $\gamma^n_k$ are $\mathcal F^n_{k-1}$-measurable.
Based on (52), we define (for each $n \ge 1$) a strategy $\pi^n = (\beta^n, \gamma^n)$ with $\beta^n =
(\beta^n_k)_{k\ge0}$ and $\gamma^n = (\gamma^n_k)_{k\ge0}$ such that its value $X^{\pi^n}_k$ is equal to $X^n_0 + \sum_{j=1}^{k} (\gamma^n_j, \Delta S^n_j)$.
To this end it suffices to choose $\beta^n_0$ and $\gamma^n_0$ such that $\beta^n_0 + (\gamma^n_0, S^n_0) = X^n_0$
(for simplicity we always set $B^n_k \equiv 1$), to take the $\gamma^n_k$ with $k \ge 1$ as in the
decomposition (52), and to define $\beta^n_k$ from the condition of self-financing.
For such strategies $\pi^n$ we clearly have $X^{\pi^n}_k = X^n_k + C^n_k \ge 0$ for all $k \le k(n)$,
and as $n \to \infty$ we have

$$X^{\pi^n}_0 = \sup_{\tilde P^n_{k(n)} \in \mathcal P(P^n_{k(n)})} E_{\tilde P^n_{k(n)}} I_{A^n} = \sup_{\tilde P^n_{k(n)} \in \mathcal P(P^n_{k(n)})} \tilde P^n_{k(n)}(A^n) \to 0$$

by assumption (50).

Thus, conditions (2) and (3) in Definition 1 in § 3a hold for the strategy $\pi =
(\pi^n)_{n\ge1}$.

To complete the proof it remains to observe that the strategy $\pi = (\pi^n)_{n\ge1}$
satisfies also condition (4) in the same Definition 1 because

$$\lim_n P^n\bigl(X^{\pi^n}_{k(n)} \ge 1\bigr) \ge \lim_n P^n\bigl(X^n_{k(n)} = 1\bigr) = \lim_n P^n(A^n) = \alpha > 0.$$

This proves the necessity in both Theorems 2 and 1.


COROLLARY. To emphasize once again the importance of the concept of contiguity
in the problems of asymptotic arbitrage in models of ‘large’ financial markets, we put
Theorem 2 in the following (equivalent) form: the condition $(P^n_{k(n)}) \lhd (\sup \tilde P^n_{k(n)})$ is
necessary and sufficient for the absence of asymptotic arbitrage on a ‘large’ locally
arbitrage-free market $(\mathbb B, \mathbb S) = \{(B^n, S^n),\ n \ge 1\}$.

Clearly, for a complete market this is the ‘standard’ condition of the contiguity

$$(P^n_{k(n)}) \lhd (\tilde P^n_{k(n)}) \tag{53}$$

of the family $(P^n_{k(n)})_{n\ge1}$ of original probability measures $P^n_{k(n)}$ to the family
$(\tilde P^n_{k(n)})_{n\ge1}$ of martingale measures $\tilde P^n_{k(n)}$ (which are unique for each $n \ge 1$).

Remark 6. In Theorems 1 and 2 we formulated sufficient conditions of the absence
of asymptotic arbitrage in terms of the contiguity (53) to some ‘chain’ of martingale
measures $(\tilde P^n_{k(n)})_{n\ge1}$.

It is worth noting that, as shown in [260], the converse result is in fact also true:
if there is no asymptotic arbitrage, then there exists a ‘chain’ of martingale measures
$(\tilde P^n_{k(n)})_{n\ge1}$ satisfying the contiguity condition (53).


§ 3d. Some Issues of Approximation and Convergence 
in the Scheme of Series of Arbitrage-Free Markets 


1. In the models of ‘large’ financial markets discussed in §§ 3a,b,c we always
assume that there exists a scheme of series of n-markets $(B^n, S^n)$, $n \ge 1$, each
of which is arbitrage-free; after that we consider the question of the absence of
asymptotic arbitrage. It should be noted that we make there no assumptions about
the existence of a ‘limit’ market $(B, S)$.

In the present section we discuss the case when, besides ‘prelimit’ n-markets
$(B^n, S^n)$ defined on the probability spaces $(\Omega^n, \mathcal F^n, P^n)$, $n \ge 1$, we also have a
‘limit’ market $(B, S)$ defined on $(\Omega, \mathcal F, P)$ and the (weak) convergence

$$\operatorname{Law}(B^n, S^n \mid P^n) \to \operatorname{Law}(B, S \mid P) \tag{1}$$

as $n \to \infty$.
We are mainly interested in understanding (under the assumption (1)) the issue
of the convergence

$$\operatorname{Law}(B^n, S^n \mid \tilde P^n) \to \operatorname{Law}(B, S \mid \tilde P), \tag{2}$$

where $\tilde P^n$ and $\tilde P$ are some or other martingale measures in the classes $\mathcal P(P^n)$ and
$\mathcal P(P)$, respectively, and in finding appropriate measures $\tilde P^n$ and $\tilde P$ such that the
convergence (2) can be ensured.

It seems timely to recall in this connection that we have already come across
various methods of the construction of martingale measures, based, for example, on
the Girsanov and the Esscher transformations. We also recall that the concept of
minimal martingale measure discussed in Chapter V, § 3d, has been developed (in
works of H. Föllmer and M. Schweizer, e.g., [167] and [429]) just in connection with
the question of which martingale measures in $\mathcal P(P^n)$ are ‘eligible’ to enter the construction
of chains of measures $(\tilde P^n)_{n\ge1}$ used for financial calculations. (It seems appropriate
to point out here that, actually, one uses martingale measures $\tilde P^n$ and $\tilde P$ rather
than the original (physical, as one puts it sometimes) measures $P^n$ and $P$ in, say,
pricing of hedges or rational option pricing; see, e.g., the main pricing formula for
European-type hedges on incomplete markets ((8) in § 1c) or formula (20) in § 4b.)


2. Proceeding to a discussion of the above questions we should recall the following
two ‘classical’ models of $(B, S)$-markets distinguished by their simplicity and
popularity in the finance literature:

the Cox-Ross-Rubinstein model
(or the binomial model; see Chapter II, § 1e) in the discrete-time case and

the Black-Merton-Scholes model
(or the standard diffusion model, based on geometric Brownian motion; see
Chapter III, § 4b) in the continuous-time case.

As is well known, the first of them is (for a small time step $\Delta > 0$) a satisfactory
approximation to the second, and pricing (of standard options, say) based on the
first model yields results close to those obtained for the second model of a
$(B, S)$-market, in which

$$B_t = B_0 e^{rt} \quad\text{and}\quad S_t = S_0 e^{(\mu - \sigma^2/2)t + \sigma W_t}, \tag{3}$$

where $W = (W_t)_{t\ge0}$ is a standard Wiener process.
In line with the invariance principle well known in the theory of limit theorems
(see, e.g., [39] and [250]), a Wiener process can be a result of a limit transition
in a great variety of random walk schemes. It is not surprising therefore
that, for instance, in binomial models of $(B^n, S^n)$-markets (with discrete time step
$\Delta = 1/n$) defined on some probability spaces $(\Omega^n, \mathcal F^n, P^n)$ we have convergence to
the $(B, S)$-model of Black-Merton-Scholes in the sense of (1) as $n \to \infty$, where $P$
is a probability measure such that $W = (W_t)_{t\ge0}$ is a Wiener process with respect
to $P$.

Note that the existence of the convergence (1) for these two ‘classical’ models as
well as for other models of financial markets is, as we see, only a part of the general
problem of the convergence of $(B^n, S^n)$-markets to some ‘limit’ $(B, S)$-market. An
equally important question is that of the convergence as $n \to \infty$ of the distributions

$$\operatorname{Law}(B^n, S^n \mid \tilde P^n) \to \operatorname{Law}(B, S \mid \tilde P), \tag{4}$$

where $\tilde P^n$ and $\tilde P$ are martingale (risk-neutral) measures for $(B^n, S^n)$- and
$(B, S)$-models, respectively.

Here one must bear in mind the following aspects related to the completeness
or incompleteness of the (arbitrage-free) markets in question.

If $(B^n, S^n)$-markets are complete, then (at any rate, if the Second fundamental
theorem holds; see Chapter V, § 4a) each collection $\mathcal P(P^n)$ of martingale measures
contains a unique element. In this case the question as to whether (4) holds under condition (1)
is connected directly to the contiguity of the families of measures $(\tilde P^n)$ and $(P^n)$,
and it can be given a fairly complete answer in the framework of the stochastic
invariance principle, which has been studied in detail, e.g., in [250; Chapter X, § 3].

However, the situation is more complicated if the arbitrage-free $(B^n, S^n)$-markets
in question are incomplete.

In this case the sets of martingale measures $\mathcal P(P^n)$ contain in general more than
one element and there arises a tricky problem of the choice of a chain $(\tilde P^n)_{n\ge1}$ of
the corresponding martingale measures ensuring the convergence (4).

In accordance with the definition of weak convergence we can reformulate (4) as
the limit relation

$$E_{\tilde P^n} f(B^n, S^n) \to E_{\tilde P} f(B, S) \tag{4'}$$

for continuous bounded functionals $f$ in the space of (càdlàg) trajectories of processes
under consideration (which, in the above context, are usually assumed to be
martingales).

Note that if $Z^n = \dfrac{d\tilde P^n}{dP^n}$ and $Z = \dfrac{d\tilde P}{dP}$, then

$$E_{\tilde P^n} f(B^n, S^n) = E_{P^n} Z^n f(B^n, S^n)$$

and

$$E_{\tilde P} f(B, S) = E_P Z f(B, S),$$

and therefore, clearly, (4') is most closely connected with the convergence

$$\operatorname{Law}(B^n, S^n, Z^n \mid P^n) \to \operatorname{Law}(B, S, Z \mid P), \tag{5}$$

relating to the ‘functional convergence’ issues of the theory of limit theorems for
stochastic processes. See [39] and [250] for greater detail.

It is worth noting that the convergence (4') follows from (5) if the family of
random variables $\{Z^n f(B^n, S^n);\ n \ge 1\}$ is uniformly integrable, i.e.,

$$\lim_{N\to\infty}\,\sup_n\, E_{P^n}\bigl(|Z^n f(B^n, S^n)|\, I(|Z^n f(B^n, S^n)| > N)\bigr) = 0.$$

3. As an example we consider the question whether properties (1) and (2) hold
for the ‘prelimit’ models of Cox-Ross-Rubinstein and the ‘limit’ model of
Black-Merton-Scholes, which are both arbitrage-free and complete.

Let $(\Omega^n, \mathcal F^n, P^n)$ be a probability space, and assume that we have a binomial
$(B^n, S^n)$-market with piecewise-constant trajectories (Chapter IV, § 2a) defined in
this space in accordance with the ‘simple return’ pattern (Chapter II, § 1a): for
$0 \le t \le 1$, $k = 1, \ldots, n$, and $n \ge 1$ we have

$$B^n_t = B^n_0 \prod_{k=1}^{[nt]} (1 + r^n_k) \tag{6}$$

and

$$S^n_t = S^n_0 \prod_{k=1}^{[nt]} (1 + \rho^n_k), \tag{7}$$

where the bank interest rates are

$$r^n_k = \frac{r}{n}, \qquad r \ge 0, \tag{8}$$

and the market stock returns are

$$\rho^n_k = \frac{\mu}{n} + \xi^n_k, \qquad \mu \ge 0. \tag{9}$$

In the homogeneous Cox-Ross-Rubinstein models the variables $\xi^n_1, \ldots, \xi^n_n$ are
independent and identically distributed, with

$$P^n\Bigl(\xi^n_k = \frac{b}{\sqrt n}\Bigr) = p \quad\text{and}\quad P^n\Bigl(\xi^n_k = -\frac{a}{\sqrt n}\Bigr) = q, \tag{10}$$

where $a$, $b$, $p$, and $q$ are some positive constants, $p + q = 1$.
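By (9) and (10) we obtain

$$E_{P^n}\rho^n_k = \frac{\mu}{n} + \frac{1}{\sqrt n}(pb - qa), \tag{11}$$

$$E_{P^n}(\rho^n_k)^2 = \frac{1}{n}(pb^2 + qa^2) + o\Bigl(\frac1n\Bigr), \tag{12}$$

and

$$D_{P^n}\rho^n_k = \frac{pq}{n}(b + a)^2 + o\Bigl(\frac1n\Bigr). \tag{13}$$

For sufficiently large $n$ we have $1 + \rho^n_k > 0$ and

$$S^n_t = S^n_0 \exp\Bigl\{\sum_{k=1}^{[nt]} \ln(1 + \rho^n_k)\Bigr\} = S^n_0 \exp\Bigl\{\sum_{k=1}^{[nt]} \Bigl[\rho^n_k - \frac{(\rho^n_k)^2}{2}\Bigr] + O\Bigl(\frac{1}{\sqrt n}\Bigr)\Bigr\}.$$

Assume that

$$pb - qa = 0.$$

Then, setting

$$\sigma^2 = pb^2 + qa^2,$$

we have $E_{P^n}\rho^n_k = \dfrac{\mu}{n}$, $E_{P^n}(\rho^n_k)^2 = \dfrac{\sigma^2}{n} + o\bigl(\tfrac1n\bigr)$, $D_{P^n}\rho^n_k = \dfrac{\sigma^2}{n} + o\bigl(\tfrac1n\bigr)$, and we obtain

$$\sum_{k=1}^{[nt]} E_{P^n}\Bigl[\rho^n_k - \frac{(\rho^n_k)^2}{2}\Bigr] \to \Bigl(\mu - \frac{\sigma^2}{2}\Bigr)t,$$

$$\sum_{k=1}^{[nt]} D_{P^n}\Bigl[\rho^n_k - \frac{(\rho^n_k)^2}{2}\Bigr] \to \sigma^2 t$$

as $n \to \infty$.

The following Lindeberg condition is clearly satisfied in the above case:

$$\lim_n \sum_{k=1}^{n} E_{P^n}\Bigl[\bigl(\ln(1 + \rho^n_k)\bigr)^2 I\bigl(|\ln(1 + \rho^n_k)| > \varepsilon\bigr)\Bigr] = 0 \quad\text{for } \varepsilon > 0. \tag{20}$$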
Hence it follows by the functional Central limit theorem ([250; Chapter VII,
Theorem 5.4]) that (as $S^n_0 \to S_0$)

$$\operatorname{Law}(S^n_t;\ t \le 1 \mid P^n) \to \operatorname{Law}(S_t;\ t \le 1 \mid P), \tag{21}$$

where

$$S_t = S_0 \exp\Bigl\{\Bigl(\mu - \frac{\sigma^2}{2}\Bigr)t + \sigma W_t\Bigr\} \tag{22}$$

and $W = (W_t)_{t\le1}$ is a Wiener process (with respect to the measure $P$).
Thus, if $B^n_0 \to B_0$, then

$$\operatorname{Law}(B^n_t, S^n_t;\ t \le 1 \mid P^n) \to \operatorname{Law}(B_t, S_t;\ t \le 1 \mid P), \tag{23}$$

i.e., we have convergence (1).
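The drift and variance that drive this limit can be checked numerically. The sketch below computes, for $t = 1$, the exact mean and variance of $\sum_{k\le n}\ln(1+\rho^n_k)$ under $P^n$ in the homogeneous model (9)-(10) and compares them with $\mu - \sigma^2/2$ and $\sigma^2$. The parameter values and the function name are illustrative assumptions, not taken from the text.

```python
import math

def log_return_moments(n, mu, a, b, p):
    """Exact mean and variance of sum_{k<=n} ln(1 + rho_k^n) under P^n (t = 1)."""
    q = 1.0 - p
    up = math.log1p(mu / n + b / math.sqrt(n))     # ln(1 + rho) on an up-move
    down = math.log1p(mu / n - a / math.sqrt(n))   # ln(1 + rho) on a down-move
    mean_one = p * up + q * down
    var_one = p * up ** 2 + q * down ** 2 - mean_one ** 2
    # The summands are i.i.d., so means and variances add up over n steps.
    return n * mean_one, n * var_one

a, b = 0.3, 0.2
p = a / (a + b)                       # forces pb - qa = 0
mu = 0.05
sigma2 = p * b ** 2 + (1.0 - p) * a ** 2
mean_n, var_n = log_return_moments(10 ** 6, mu, a, b, p)
print(mean_n, mu - sigma2 / 2.0)
print(var_n, sigma2)
```

For large $n$ the two printed pairs should be close, in line with the limits $(\mu - \sigma^2/2)t$ and $\sigma^2 t$ at $t = 1$.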
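We turn now to an analogue of property (23) in the case when $P^n$ and $P$ are
replaced by martingale measures $\tilde P^n$ and $\tilde P$.

If

$$\tilde P^n\Bigl(\xi^n_k = \frac{b}{\sqrt n}\Bigr) = \tilde p^n, \qquad \tilde P^n\Bigl(\xi^n_k = -\frac{a}{\sqrt n}\Bigr) = \tilde q^n$$

for $k = 1, \ldots, n$, and the variables $\xi^n_1, \ldots, \xi^n_n$ are independent with respect to $\tilde P^n$,
then the martingale condition

$$E_{\tilde P^n}\bigl(\rho^n_k \,\big|\, \mathcal F^n_{k-1}\bigr) = r^n_k$$

brings us to the relation

$$\frac{\mu}{n} + \frac{1}{\sqrt n}\bigl(b\tilde p^n - a\tilde q^n\bigr) = \frac{r}{n},$$

which is equivalent to the condition

$$b\tilde p^n - a\tilde q^n = \frac{r - \mu}{\sqrt n}. \tag{24}$$

Taking account of the equality $\tilde p^n + \tilde q^n = 1$, we find that

$$\tilde p^n = \frac{a}{a+b} + \frac{1}{\sqrt n}\,\frac{r - \mu}{a+b}, \qquad \tilde q^n = \frac{b}{a+b} - \frac{1}{\sqrt n}\,\frac{r - \mu}{a+b}. \tag{25}$$

(Cf. Example 2 in Chapter V, § 3f, where the uniqueness of the martingale
measure $\tilde P^n$ and the independence of $\xi^n_1, \ldots, \xi^n_n$ with respect to this measure were also
established.)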
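Formulas (24) and (25) are easy to verify directly. The following sketch computes $\tilde p^n$ and $\tilde q^n$ from (25) and confirms that the expected one-step return under $\tilde P^n$ equals the bank rate $r/n$. The numerical parameters and the function name are assumptions made only for illustration.

```python
import math

def martingale_probabilities(n, r, mu, a, b):
    """Return (p~^n, q~^n) given by formula (25)."""
    shift = (r - mu) / (math.sqrt(n) * (a + b))
    return a / (a + b) + shift, b / (a + b) - shift

n, r, mu, a, b = 400, 0.02, 0.05, 0.3, 0.2
p_t, q_t = martingale_probabilities(n, r, mu, a, b)

# Expected return of the stock over one step under P~^n, cf. (9) and (24).
expected_return = mu / n + (b * p_t - a * q_t) / math.sqrt(n)
print(p_t, q_t, expected_return, r / n)
```

The last two printed values agree, which is the martingale condition for the discounted stock price in this one-step setting.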
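Hence

$$E_{\tilde P^n}\rho^n_k = \frac{r}{n}, \qquad D_{\tilde P^n}\rho^n_k = \frac{\tilde\sigma^2}{n} + o\Bigl(\frac1n\Bigr),$$

where $\tilde\sigma^2 = ab$.
Note that the conditions $p + q = 1$ and $pb - qa = 0$ mean that $ab = pb^2 + qa^2$.
In other words, $\tilde\sigma^2 = \sigma^2$ and therefore

$$\sum_{k=1}^{[nt]} E_{\tilde P^n}\Bigl[\rho^n_k - \frac{(\rho^n_k)^2}{2}\Bigr] \to \Bigl(r - \frac{\sigma^2}{2}\Bigr)t,$$

$$\sum_{k=1}^{[nt]} D_{\tilde P^n}\Bigl[\rho^n_k - \frac{(\rho^n_k)^2}{2}\Bigr] \to \sigma^2 t.$$

The Lindeberg condition (20) withstands the replacement of the measure $P^n$
by $\tilde P^n$. Hence using the functional Central limit theorem (as $S^n_0 \to S_0$) we obtain

$$\operatorname{Law}(S^n_t;\ t \le 1 \mid \tilde P^n) \to \operatorname{Law}(S_t;\ t \le 1 \mid \tilde P), \tag{26}$$

where

$$S_t = S_0 \exp\Bigl\{\Bigl(r - \frac{\sigma^2}{2}\Bigr)t + \sigma \tilde W_t\Bigr\},$$

$\tilde W = (\tilde W_t)_{t\le1}$ is a Wiener process with respect to $\tilde P$, which is a unique martingale
measure existing by Girsanov’s theorem (Chapter III, § 3e):

$$d\tilde P = \exp\Bigl\{-\frac{\mu - r}{\sigma}W_1 - \frac12\Bigl(\frac{\mu - r}{\sigma}\Bigr)^2\Bigr\}\,dP.$$

In addition, $\tilde W_t = W_t + \dfrac{\mu - r}{\sigma}\,t$.

Thus, we have the following result.

THEOREM. If the parameters $a > 0$, $b > 0$, $p > 0$, and $q > 0$ in the ‘prelimit’
Cox-Ross-Rubinstein models defined by (6)-(10) satisfy the conditions $pb - qa = 0$
and $p + q = 1$, then one has the convergence (21) and (26) to the ‘limit’
Black-Merton-Scholes models.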
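4. We now modify slightly the ‘prelimit’ Cox-Ross-Rubinstein models, dropping
the restrictive condition $pb - qa = 0$ so as to retain the right to use the functional
limit theorem.

To this end we assume that the variables $\rho^n_k$ are still defined by formulas (9) for
odd $k = 1, 3, \ldots$, whereas for even $k = 2, 4, \ldots$ we have

$$\rho^n_k = \frac{\mu}{n} + \eta^n_k, \qquad \mu \ge 0,$$

where

$$P^n\Bigl(\eta^n_k = -\frac{b}{\sqrt n}\Bigr) = p, \qquad P^n\Bigl(\eta^n_k = \frac{a}{\sqrt n}\Bigr) = q.$$

Then for $k = 2, 4, \ldots$ we obtain

$$E_{P^n}\rho^n_k = \frac{\mu}{n} - \frac{1}{\sqrt n}(pb - qa),$$

$$E_{P^n}(\rho^n_k)^2 = \frac{1}{n}(pb^2 + qa^2) + o\Bigl(\frac1n\Bigr),$$

$$D_{P^n}\rho^n_k = \frac{pq}{n}(a + b)^2 + o\Bigl(\frac1n\Bigr).$$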


Hence, in view of (11)-(13) (for odd k), 
[nt] 
p 
X Ep» lok l 
k=1 


[nt] n\2 
DL Dpn lag — | ~ ot, 
k=1 


N Prd 
n 
N 
| 
4 
MES 
T 
N zE 
See 
a 


where 
ot = pala + b)?. 


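Thus, in our case of the inhomogeneous Cox-Ross-Rubinstein model we obtain
the same result as in the homogeneous case; namely,

$$\operatorname{Law}(S^n_t;\ t \le 1 \mid P^n) \to \operatorname{Law}(S_t;\ t \le 1 \mid P), \tag{27}$$

where $S = (S_t)_{t\le1}$ is a process defined by (22) with $\sigma^2 = pq(a+b)^2$.

Now let $\tilde P^n$ be martingale measures such that the variables $\rho^n_1, \ldots, \rho^n_n$ are
independent again and

$$\tilde P^n\Bigl(\xi^n_k = \frac{b}{\sqrt n}\Bigr) = \tilde p^n_k, \qquad \tilde P^n\Bigl(\xi^n_k = -\frac{a}{\sqrt n}\Bigr) = \tilde q^n_k$$

for odd $k$, where $\tilde p^n_k = \tilde p^n$ and $\tilde q^n_k = \tilde q^n$ are given by (25), and

$$\tilde P^n\Bigl(\eta^n_k = -\frac{b}{\sqrt n}\Bigr) = \tilde p^n_k, \qquad \tilde P^n\Bigl(\eta^n_k = \frac{a}{\sqrt n}\Bigr) = \tilde q^n_k$$

for even $k$, with $\tilde p^n_k$ and $\tilde q^n_k$ defined (due to the martingale condition) by the formulas

$$\tilde p^n_k = \frac{a}{a+b} - \frac{1}{\sqrt n}\,\frac{r - \mu}{a+b}, \qquad \tilde q^n_k = \frac{b}{a+b} + \frac{1}{\sqrt n}\,\frac{r - \mu}{a+b}.$$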
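Then for all $k = 1, \ldots, n$ we have

$$E_{\tilde P^n}\rho^n_k = \frac{r}{n}$$

and

$$D_{\tilde P^n}\rho^n_k = \frac{\tilde\sigma^2}{n} + o\Bigl(\frac1n\Bigr).$$

Consequently,

$$\sum_{k=1}^{[nt]} E_{\tilde P^n}\Bigl[\rho^n_k - \frac{(\rho^n_k)^2}{2}\Bigr] \to \Bigl(r - \frac{\tilde\sigma^2}{2}\Bigr)t,$$

$$\sum_{k=1}^{[nt]} D_{\tilde P^n}\Bigl[\rho^n_k - \frac{(\rho^n_k)^2}{2}\Bigr] \to \tilde\sigma^2 t,$$

where $\tilde\sigma^2 = ab$, and in view of the Lindeberg condition (which is still satisfied) we
see that (as $S^n_0 \to S_0$)

$$\operatorname{Law}(S^n_t;\ t \le 1 \mid \tilde P^n) \to \operatorname{Law}(S_t;\ t \le 1 \mid \tilde P), \tag{28}$$

where

$$S_t = S_0 \exp\Bigl\{\Bigl(r - \frac{\tilde\sigma^2}{2}\Bigr)t + \tilde\sigma \tilde W_t\Bigr\}$$

and $\tilde W = (\tilde W_t)_{t\le1}$ is a Wiener process with respect to the measure $\tilde P$ such that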


p z E E 
dP = exp im = (4) ap. 
We point out that, in general, $ab \ne pq(a + b)^2$, and therefore $\tilde\sigma^2 \ne \sigma^2$. Hence
if the ‘limiting’ model has volatility $\sigma^2$ and the parameters $a > 0$, $b > 0$, $p > 0$,
and $q > 0$ satisfy the equalities $p + q = 1$ and $\sigma^2 = pq(a + b)^2$, then we have the
functional convergence (27); however the ‘limit’ volatility $\tilde\sigma^2$ in (28) can happen to
be distinct from $\sigma^2$ (if we drop the ‘restrictive’ condition $pb - aq = 0$).

This example of an inhomogeneous Cox-Ross-Rubinstein model shows that in
choosing such models as approximations to the Black-Merton-Scholes model with
parameters $(\mu, \sigma^2)$ one must be careful in the selection of the parameters $(p, q, a, b)$
of the ‘prelimit’ models, since even if there is convergence (27) with respect to the
original probability measure, the corresponding convergence (to the
Black-Merton-Scholes models with parameters $(\mu, \sigma^2)$) with respect to martingale measures may
well fail. This, in turn, means that the rational (hedging) prices $\mathbb C^n$ in the ‘prelimit’
models do not necessarily converge to the (anticipated) price $\mathbb C$ in the ‘limiting’
model.

Clearly, a similar situation can arise in the framework of other approximation
schemes.
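The size of the discrepancy is easy to illustrate numerically. The sketch below takes the two limit variances quoted in the text, $\sigma^2 = pq(a+b)^2$ in (27) and $\tilde\sigma^2 = ab$ in (28), and feeds both into the standard Black-Scholes formula for a European call. The parameter values, the at-the-money strike, and the function names are assumptions chosen only for illustration.

```python
import math

def limit_variances(a, b, p):
    """Limit variances: pq(a+b)^2 under P^n (see (27)), ab under P~^n (see (28))."""
    q = 1.0 - p
    return p * q * (a + b) ** 2, a * b

def black_scholes_call(s0, strike, r, sigma, t=1.0):
    """Standard Black-Scholes price of a European call (no dividends)."""
    d1 = (math.log(s0 / strike) + (r + 0.5 * sigma ** 2) * t) / (sigma * math.sqrt(t))
    d2 = d1 - sigma * math.sqrt(t)
    cdf = lambda x: 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    return s0 * cdf(d1) - strike * math.exp(-r * t) * cdf(d2)

a, b, p = 0.3, 0.2, 0.5   # pb - qa = -0.05, so the restrictive condition fails
s2_physical, s2_martingale = limit_variances(a, b, p)
price_anticipated = black_scholes_call(100.0, 100.0, 0.02, math.sqrt(s2_physical))
price_limit = black_scholes_call(100.0, 100.0, 0.02, math.sqrt(s2_martingale))
print(s2_physical, s2_martingale)
print(price_anticipated, price_limit)

# With p = a / (a + b) the condition pb - qa = 0 holds and both variances agree.
s2_p_ok, s2_m_ok = limit_variances(a, b, a / (a + b))
print(s2_p_ok, s2_m_ok)
```

With these hypothetical numbers the anticipated variance is 0.0625 while the variance in the martingale limit is 0.06, so the two call prices differ; under $pb - qa = 0$ the mismatch disappears.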


5. In connection with the above cases of the convergence of the processes
$S^n = (S^n_t)_{t\le1}$ to $S = (S_t)_{t\le1}$, both with respect to the original probability
measures ($P^n$ and $P$) and with respect to the martingale measures ($\tilde P^n$ and $\tilde P$), it
seems appropriate to present several general results in this direction.

Assuming for simplicity that $B^n_t \equiv 1$ and $B_t \equiv 1$ for $t \le 1$ and $n \ge 1$, we
observe first of all that the weak convergence of the laws $\operatorname{Law}(S^n \mid P^n)$ does not
imply the weak convergence of $\operatorname{Law}(S^n \mid \tilde P^n)$ even in the presence of the contiguity
$(\tilde P^n) \lhd (P^n)$, although the sequence $(\operatorname{Law}(S^n \mid \tilde P^n))_{n\ge1}$ is nevertheless tight (see
the monograph [250; Chapter X, § 3]), and therefore, in general, can have several
limit measures (corresponding to different subsequences).

One standard trick ensuring the uniqueness of the limit probability measure is
that, besides the condition of the weak convergence of the laws $\operatorname{Law}(S^n \mid P^n)$ and the
contiguity $(\tilde P^n) \lhd (P^n)$, we assume the weak convergence of the joint distributions
$\operatorname{Law}(S^n, Z^n \mid P^n)$, where $Z^n = (Z^n_t)_{t\le1}$ and the $Z^n_t = \dfrac{d\tilde P^n_t}{dP^n_t}$ are the densities of the
$\tilde P^n_t$ with respect to the $P^n_t$ (here $\tilde P^n_t$ and $P^n_t$ are the restrictions of $\tilde P^n$ and $P^n$ to
the σ-algebra $\mathcal F^n_t$ from the stochastic basis $(\Omega^n, \mathcal F^n, (\mathcal F^n_t)_{t\le1}, P^n)$ underlying the
processes $S^n = (S^n_t)_{t\le1}$).

Then it follows from the generalized version of so-called LeCam’s third lemma
(see [250; Chapter X, Theorem 3.3]) that the sequence of laws $\operatorname{Law}(S^n, Z^n \mid \tilde P^n)$
converges weakly to some probability measure that is absolutely continuous with respect
to the measure that is the weak limit of $\operatorname{Law}(S^n, Z^n \mid P^n)$, $n \ge 1$. If, in addition,
$\operatorname{Law}(S^n, Z^n \mid P^n) \to \operatorname{Law}(S, Z \mid P)$, then $\operatorname{Law}(S^n, Z^n \mid \tilde P^n) \to \operatorname{Law}(S, Z \mid \tilde P)$, where
$d\tilde P = Z_1\,dP$.
6. For illustrations of these results we turn again to the Cox-Ross-Rubinstein
model (6)-(7) considered above, set for simplicity $r = 0$ and $B^n_t \equiv 1$, and put
this model in the form used already in the construction of the minimal martingale
measure in Chapter V, § 3d.7 (see also [392]).

Let $H^n_k = \sum_{i=1}^{k} \rho^n_i$. If $pb - qa = 0$, then the sequence $M^n = (M^n_k)_{k\le n}$, where
$M^n_k = \sum_{i=1}^{k} \xi^n_i$, is a square integrable martingale with quadratic characteristic
$\langle M^n\rangle_k = \dfrac{k}{n}\,\sigma^2$, where $\sigma^2 = pb^2 + qa^2$.

Setting $a^n_k = \mu/\sigma^2$ we can write the Doob decomposition (Chapter II, § 1b) of
$H^n = (H^n_k)_{k\le n}$ in the following form (in terms of the increments):

$$\Delta H^n_k = a^n_k\,\Delta\langle M^n\rangle_k + \Delta M^n_k. \tag{29}$$

Assume that $b\mu < \sigma^2$. Then the (minimal) measure $\tilde P^n$ such that

$$d\tilde P^n = \prod_{k=1}^{n} \bigl(1 - a^n_k\,\Delta M^n_k\bigr)\,dP^n \tag{30}$$

(cf. formula (31) in Chapter V, § 3d) is a probability measure and, moreover, a
(unique) martingale measure for the sequence $H^n = (H^n_k)_{k\le n}$, as follows from
Chapter V, § 3d.7 and can be verified directly.
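The ‘direct verification’ amounts to a one-step computation, sketched below: with $pb - qa = 0$ and $r = 0$, each factor $1 - a^n_k\,\Delta M^n_k$ in (30) is positive, has $P^n$-expectation 1, and annihilates the mean of $\Delta H^n_k$. The parameter values and the function name are illustrative assumptions.

```python
import math

def minimal_measure_one_step(n, mu, a, b):
    """One-step check of the density factor 1 - a_k^n * Delta M_k^n from (30)."""
    p = a / (a + b)                       # forces pb - qa = 0
    q = 1.0 - p
    sigma2 = p * b ** 2 + q * a ** 2
    coef = mu / sigma2                    # a_k^n in the decomposition (29)
    jumps = (b / math.sqrt(n), -a / math.sqrt(n))   # values of Delta M_k^n = xi_k^n
    probs = (p, q)
    factors = [1.0 - coef * x for x in jumps]
    total_mass = sum(w * f for w, f in zip(probs, factors))
    # Mean of Delta H_k^n = mu/n + xi_k^n under the new measure.
    mean_increment = sum(w * f * (mu / n + x) for w, f, x in zip(probs, factors, jumps))
    return factors, total_mass, mean_increment

factors, total_mass, mean_increment = minimal_measure_one_step(100, 0.05, 0.3, 0.2)
print(factors, total_mass, mean_increment)
```

Here $b\mu = 0.01 < \sigma^2 = 0.06$, so both factors are positive; the total mass is 1 and the mean increment is 0 up to rounding, as required of a martingale measure for $H^n$.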
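We set

$$Z^n_t = \prod_{k=1}^{[nt]} \bigl(1 - a^n_k\,\Delta M^n_k\bigr) = \mathcal E\Bigl(-\sum_{k\le\,\cdot} a^n_k\,\Delta M^n_k\Bigr)_{[nt]} \tag{31}$$

and represent $S^n_t$ as follows:

$$S^n_t = S^n_0 \prod_{k=1}^{[nt]} (1 + \Delta H^n_k) = S^n_0\,\mathcal E(H^n)_{[nt]} = S^n_0\,\mathcal E\Bigl(\sum_{k\le\,\cdot} a^n_k\,\Delta\langle M^n\rangle_k + M^n\Bigr)_{[nt]}. \tag{32}$$

Let also $M = (M_t)_{t\le1}$ be a square integrable martingale (on some stochastic
basis $(\Omega, \mathcal F, (\mathcal F_t)_{t\le1}, P)$) with quadratic characteristic $\langle M\rangle = (\langle M\rangle_t)_{t\le1}$, let
$a = (a_t)_{t\le1}$ be a predictable process with $a^2 \cdot \langle M\rangle_1 < \infty$ (i.e., with $\int_0^1 a_s^2\,d\langle M\rangle_s < \infty$),
let

$$H_t = \int_0^t a_s\,d\langle M\rangle_s + M_t, \tag{33}$$

and let

$$Z_t = \mathcal E\Bigl(-\int_0^{\cdot} a_s\,dM_s\Bigr)_t, \qquad S_t = S_0\,\mathcal E(H)_t. \tag{34}$$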

The structure of relations (31)-(34) suggests the conditions that one must impose
on these processes to obtain the weak convergence

$$\operatorname{Law}(S^n_t, Z^n_t;\ t \le 1 \mid P^n) \to \operatorname{Law}(S_t, Z_t;\ t \le 1 \mid P). \tag{35}$$

Thus, if $a^n = (a^n_t)_{t\le1}$, $M^n = (M^n_t)_{t\le1}$ and $\langle M^n\rangle = (\langle M^n\rangle_t)_{t\le1}$ are piecewise-constant
processes constructed from $(a^n_k)_{k\le n}$, $(M^n_k)_{k\le n}$, and $(\langle M^n\rangle_k)_{k\le n}$, then it
is clearly sufficient for (35) that the distributions

$$\operatorname{Law}\Bigl(M^n_t,\ \sum_{k=1}^{[nt]} a^n_k\,\Delta M^n_k,\ \sum_{k=1}^{[nt]} a^n_k\,\Delta\langle M^n\rangle_k;\ t \le 1 \,\Big|\, P^n\Bigr)$$

converge to

$$\operatorname{Law}\Bigl(M_t,\ \int_0^t a_s\,dM_s,\ \int_0^t a_s\,d\langle M\rangle_s;\ t \le 1 \,\Big|\, P\Bigr)$$

and $S^n_0 \to S_0$.

Various conditions ensuring such convergence of martingales and stochastic integrals
can be found, e.g., in [250; Chapter IX] and [254]. In particular, it follows
from Theorems 2.6 and 2.11 in [254] that the required convergence occurs once

$$\operatorname{Law}(M^n_t, a^n_t;\ t \le 1 \mid P^n) \to \operatorname{Law}(M_t, a_t;\ t \le 1 \mid P)$$

and the jumps of martingales satisfy the following condition of uniform smallness:

$$\sup_n E_{P^n}\Bigl[\sup_{t\le1} |\Delta M^n_t|\Bigr] < \infty.$$
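This clearly holds for the model (6)-(7) and (as seen from subsection 3) we can
take as the limit $M = (M_t)_{t\le1}$ the process with $M_t = \sigma W_t$ and $\sigma^2 = pb^2 + qa^2$,
where $W = (W_t)_{t\le1}$ is a standard Wiener process. Then $H_t = \mu t + \sigma W_t$ and
$S_t = S_0\,\mathcal E(H)_t$. Since $d\mathcal E(H)_t = \mathcal E(H)_t\,dH_t$, it follows that $dS_t = S_t(\mu\,dt + \sigma\,dW_t)$.
Hence, as one would expect, the process $S = (S_t)_{t\le1}$ is just a geometric Brownian
motion:

$$S_t = S_0 \exp\Bigl\{\Bigl(\mu - \frac{\sigma^2}{2}\Bigr)t + \sigma W_t\Bigr\}.$$

We also see from (34) that

$$Z_t = \exp\Bigl\{-\frac{\mu}{\sigma}W_t - \frac12\Bigl(\frac{\mu}{\sigma}\Bigr)^2 t\Bigr\}.$$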
In the present case we have the contiguity $(\tilde P^n) \lhd (P^n)$, which can be proved as
in Example 1 in § 3c.5. Hence it follows from the above-cited generalized version of
LeCam’s third lemma that

$$\operatorname{Law}(S^n_t;\ t \le 1 \mid \tilde P^n) \to \operatorname{Law}(S_t;\ t \le 1 \mid \tilde P),$$

where $d\tilde P = Z_1\,dP$.
Since, by Girsanov’s theorem (Chapter III, § 3e), the process $\tilde W = (\tilde W_t)_{t\le1}$
with $\tilde W_t = W_t + \dfrac{\mu}{\sigma}\,t$ is Wiener with respect to the measure $\tilde P$, it follows that

$$\operatorname{Law}(\mu t + \sigma W_t;\ t \le 1 \mid \tilde P) = \operatorname{Law}(\sigma \tilde W_t;\ t \le 1 \mid \tilde P) = \operatorname{Law}(\sigma W_t;\ t \le 1 \mid P).$$

Hence

$$\operatorname{Law}(S_t;\ t \le 1 \mid \tilde P) = \operatorname{Law}\bigl(S_0 e^{-\frac{\sigma^2}{2}t + \sigma W_t};\ t \le 1 \,\big|\, P\bigr),$$

which has been already proved in subsection 3 by a direct application of the functional
Central limit theorem (see (26)).


4. European Options on a Binomial (B, S)-Market


§4a. Problems of Option Pricing 


1. Following a long-established tradition in finance and in accordance with the
nomenclature in Chapter I, § 1a we distinguish two kinds of financial instruments
and, in particular, securities:

basic (primary)
and

derivative (secondary).

We discussed basic securities (stock, bonds, currency) at length in the first chapters.
We considered various models of their dynamics, and the results of statistical
analysis revealing such phenomena in the behavior of financial data as the cluster
property, fractality, long memory, and some others.

We have also paid much attention to the theory underlying derivatives pricing, 
which is based upon the concept of arbitrage-free, ‘fair’ financial market. 

As pointed out in Chapter I, not only speculators may develop an interest in 
derivatives. Importantly, they play the role of hedging instruments, which protect 
one from financial risks incurred by uncertainties of the price development. 

For instance, if the current price of a share in corporation A is $S_0 = 100$ and the
investor anticipates its growth ($S_1 = 120$), then he can buy a share (at time $n = 0$)
and then (at time $n = 1$) sell it, pocketing the profit of $S_1 - S_0 = 120 - 100 = 20$.

Of course, the price can also drop ($S_0 = 100 \downarrow S_1 = 80$), and then selling the
share will bring losses: $S_1 - S_0 = 80 - 100 = -20$.

Thus, we have two possible patterns of development:

$$S_0 = 100\ \begin{array}{l} \nearrow\ S_1 = 120 \\ \searrow\ S_1 = 80 \end{array}$$

and ‘large’ gains (= 20, i.e., 20% of the price $S_0 = 100$) are accompanied by a
‘large’ risk of losses (= −20, i.e., 20% of $S_0 = 100$).

Besides the above strategy (of buying and selling basic securities), an investor
can also look at the derivatives market. For example, he can buy a call option with
maturity time $n = 1$, which gives him a right (see Chapter I, § 1c) to buy a share
at time $n = 1$ at a price of $K = 100$, say.

In this case, if $S_0 = 100$ and $S_1 = 120$, then the investor buys the share at the
price $K = 100$ (fixed in advance) and immediately sells it (on the so-called spot
market°) at the market price $S_1 = 120$, pocketing the profit of $(S_1 - K)^+ = 20$.

Of course, for buying this option that grants the right of a purchase at a fixed
price ($K = 100$) one pays a certain premium to the writer. Assume that this premium
is $C_N = 10$. If the stock moves up ($S_0 = 100 \uparrow S_1 = 120$), then the buyer’s net
profit is $20 - 10 = 10$. On the other hand, if the prices drop ($S_0 = 100 \downarrow S_1 = 80$), then the
buyer does not exercise his option (there is no sense in the purchase of a share at
the price $K = 100$ when one can buy it at the lower market price $S_1 = 80$), and
takes losses in the amount of the premium (= 10) paid for the option.

Hence, the purchase of a call option (which is just one kind of derivative)
reduces the risks of an investor (he can now lose only 10 in place of 20 ‘units’), but,
of course, his potential profits have also slimmed (10 ‘units’ in place of 20).

Thus, we can say that the strategy of a speculator anticipating a rise of price (of
a ‘bull’, using the term of Chapter I, § 1c) is better insured against possible losses
if it is based on the purchase of a call option than when it is based on transactions
involving stock directly.

We consider now a ‘bear’, a speculator who anticipates a drop of prices (the
price of a share, in our case). In principle, it is not against the rules on many
markets to sell stock one does not have at the moment. Assume that our ‘bear’
undertakes to sell stock at time $n = 1$ and the corresponding exercise price is 100
‘units’. If the price $S_1$ drops to 80, in line with the ‘bear’s’ forecasts, then he will
buy a share at this (market) price $S_1 = 80$ and take a profit of 20 ‘units’. However,
if $S_1 = 120$, then his losses will be 20 ‘units’. Again, both huge profits and huge
losses are possible.

Similarly to the ‘bull’, the ‘bear’ can turn to the derivatives market. For instance,
he can buy a put option (for 10 ‘units’, with $K = 100$), which gives him
the right to sell the share (which he does not necessarily have at the moment) at
the price $K = 100$. Later on, if $S_1 = 80$, then he buys a share at this market
price and sells it for $K = 100$ ‘units’, as stipulated in the contract. Allowing
for the premium, his net profit is $20 - 10 = 10$ ‘units’ if prices actually drop
($S_0 = 100 \downarrow S_1 = 80$). On the other hand, if they rise, then the ‘bear’ takes losses
of 10 ‘units’. Hence, the purchase of a put option reduces speculative risks, but
reduces accordingly also the possible speculative gains.

°The following terms are often used in the finance literature (see, e.g., [50]): deals
(contracts) providing for a delivery or certain actions at some moment in the future
(options, futures, forwards, etc.) are usually said to be forward, while the ones including
instant delivery are said to be spot.

The above figures are fairly arbitrary; nevertheless, our illustration of the ‘spec- 
ulative’ and ‘protective’ functions of derivatives is adequate on the whole. 
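The arithmetic of the four positions discussed above can be collected in a few lines. This is only a sketch using the illustrative figures of the text ($S_0 = 100$, $K = 100$, premium 10, $S_1 \in \{120, 80\}$); the function names are introduced here for convenience.

```python
def stock_long(s1, s0=100):
    """'Bull' trading the stock directly: buy at S_0, sell at S_1."""
    return s1 - s0

def call_buyer(s1, strike=100, premium=10):
    """'Bull' buying a call: payoff (S_1 - K)^+ minus the premium."""
    return max(s1 - strike, 0) - premium

def stock_short(s1, s0=100):
    """'Bear' selling the stock forward at 100 and buying it at S_1."""
    return s0 - s1

def put_buyer(s1, strike=100, premium=10):
    """'Bear' buying a put: payoff (K - S_1)^+ minus the premium."""
    return max(strike - s1, 0) - premium

table = {
    s1: (stock_long(s1), call_buyer(s1), stock_short(s1), put_buyer(s1))
    for s1 in (120, 80)
}
for s1, row in table.items():
    print(s1, row)
```

Each row lists the net result of the stock purchase, the call, the short sale, and the put; in both scenarios the option positions halve the loss and the gain relative to the direct positions.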


2. One cardinal issue relating to options is the value of the fair, ratzonal premium 
paid for the purchase of an option contract. This is important for the buyer and 
also for the writer, who sells the derivative and must use the premium to provide 
for his ability to meet the terms of the contract. Of course, the writer of options is 
also interested in the assessment of the overall profits or losses from their flotation 
on the market. 

Note that a pricing theory for some or other derivatives must be built upon 
concrete models describing basic derivatives and general assumptions about the 
structure and the mechanisms of securities markets. The simplest in this respect is 
the (B, S)-market described by the binomial CRR-model of Cox—Ross—Rubinstein 
(see Chapter II, §1e). For all its simplicity, in its analysis one can more easily 
understand general principles and find examples of pricing based on the concept of 
‘absence of arbitrage’. Our discussion will be built around options, both for their 
own sake and because many problems related to the markets of other derivatives can
either be reformulated in the language of options or benefit by the well-developed 
techniques of option pricing, which is based on the plain and fruitful idea of hedging. 


§ 4b. Rational Pricing and Hedging Strategies. 
General Pay-Off Functions 


1. In the CRR-model of a $(B, S)$-market formed by two assets, a bank account
$B = (B_n)$ and a stock $S = (S_n)$, one assumes that

$$\Delta B_n = r B_{n-1}, \qquad \Delta S_n = \rho_n S_{n-1}, \tag{1}$$

where $(\rho_n)$ is a sequence of independent random variables taking two values, $a$ and
$b$, $a < b$, and $r$ is the interest rate, $-1 < a < r < b$.

Moreover, we assume that the sequence $\rho = (\rho_n)$ of variables defined on the
underlying filtered probability space $(\Omega, \mathcal{F}, (\mathcal{F}_n), P)$ has the property

$$P(\rho_n = b) = p \quad \text{and} \quad P(\rho_n = a) = q,$$

where $p + q = 1$, $0 < p, q < 1$, and the $\rho_n$ are $\mathcal{F}_n$-measurable for each $n$.

All the randomness in this model is due to the variables $\rho_n$; therefore, we can
take as the space $\Omega$ of elementary outcomes either the space $\Omega_N = \{a, b\}^N$ of
finite sequences $x = (x_1, x_2, \dots, x_N)$ such that $x_n = a$ or $x_n = b$ (if $n \le N$)
or the space $\Omega_\infty = \{a, b\}^\infty$ of infinite sequences $x = (x_1, x_2, \dots)$ with $x_n = a, b$
(if $n \in \{1, 2, \dots\}$). Then $\rho_n(x) = x_n$ and since both spaces $\Omega_N$ and $\Omega_\infty$ are discrete,




probability measures $P_N$ and $P$ on the corresponding systems of Borel sets are
completely defined by their finite-dimensional distributions $P_n = P_n(x_1, \dots, x_n)$,
where $n \le N$ or $n < \infty$.


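If $\nu_b(x_1, \dots, x_n) = \sum_{i=1}^{n} I_b(x_i)$ is the number of the components $x_i$ equal to $b$ for
$i \le n$, then, obviously,

$$P_n(x_1, \dots, x_n) = p^{\nu_b(x_1, \dots, x_n)}\, q^{\,n - \nu_b(x_1, \dots, x_n)}. \tag{2}$$

Putting this another way we can say that $P_n$ is equal to $Q \otimes \cdots \otimes Q$ ($n$ times), the direct
product of the measures $Q$ such that $Q(\{b\}) = p$ and $Q(\{a\}) = q$, where $p > 0$,
$q > 0$, and $p + q = 1$. As shown in Chapter V, § 1d, the CRR-model is arbitrage-free
and complete, while, by the First and the Second fundamental theorems, for each
$n \ge 1$ there exists a unique martingale measure $\widetilde{P}_n \sim P_n$, which has the following
simple structure (cf. (2)):

$$\widetilde{P}_n(x_1, \dots, x_n) = \tilde{p}^{\,\nu_b(x_1, \dots, x_n)}\, \tilde{q}^{\,n - \nu_b(x_1, \dots, x_n)}, \tag{3}$$

where

$$\tilde{p} = \frac{r - a}{b - a}, \qquad \tilde{q} = \frac{b - r}{b - a}. \tag{4}$$

We see from (3) that $\widetilde{P}_n$, like $P_n$, has the structure of a direct product:
$\widetilde{P}_n = \widetilde{Q} \otimes \cdots \otimes \widetilde{Q}$ ($n$ times), where $\widetilde{Q}(\{b\}) = \tilde{p}$ and $\widetilde{Q}(\{a\}) = \tilde{q}$.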

2. We shall now consider European options of maturity $N < \infty$ with pay-off (contingent
claim) $f_N$ depending, in general, on all the variables $S_0, S_1, \dots, S_N$, or,
equivalently, on $S_0$ and $\rho_1, \dots, \rho_N$. (See Chapter I, § 1c for various concepts related
to options.)

We have already mentioned that both writer (issuer) and buyer of an option 
contract come across the central problem of the correct definition of the ‘fair’ (‘ra- 
tional’) price of this contract. 

In accordance with Chapter V, § 1b, if a market is complete and arbitrage-free
(as is our binomial $(B, S)$-market), then as a fair price, one can reasonably regard
the following quantity (the price of perfect European hedging):

$$C(f_N; P) = \inf\{x \ge 0 : \exists\, \pi \text{ such that } X_0^\pi = x \text{ and } X_N^\pi = f_N \ (P\text{-a.s.})\}, \tag{5}$$

where $X^\pi = (X_n^\pi)_{0 \le n \le N}$ is the value of the self-financing strategy $\pi = (\beta, \gamma)$. (See
Chapter V, § 1b and § 1b in the present chapter for greater detail.)
Moreover, one can calculate $C(f_N; P)$ by formula (4) in § 1b; namely,

$$C(f_N; P) = B_0\, \widetilde{E}\, \frac{f_N}{B_N}, \tag{6}$$

where $\widetilde{E}$ is averaging with respect to the martingale measure $\widetilde{P}_N$.
For the model (1) we have $B_N = B_0 (1 + r)^N$. Hence we obtain there

$$C(f_N; P) = \widetilde{E}\, \frac{f_N}{(1 + r)^N}, \tag{7}$$

and, in principle, this gives one a complete answer to the question on the rational
price of an option contract with pay-off $f_N$.
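Formula (7) can be checked directly on small examples by summing over all $2^N$ price paths. The following is a minimal sketch, not part of the original text: the function name `crr_price_by_enumeration` and the sample parameters are illustrative assumptions, and the enumeration is exponential in $N$, so it is suitable only for short horizons.

```python
from itertools import product


def crr_price_by_enumeration(payoff, S0, a, b, r, N):
    """Price E~[f_N] / (1 + r)^N by summing over all 2^N paths.

    `payoff` receives the whole path [S_0, ..., S_N], so path-dependent
    pay-offs are allowed. Assumes -1 < a < r < b.
    """
    p = (r - a) / (b - a)  # martingale probability of the value b
    price = 0.0
    for moves in product((a, b), repeat=N):
        path = [S0]
        prob = 1.0
        for rho in moves:
            path.append(path[-1] * (1 + rho))
            prob *= p if rho == b else 1 - p
        price += prob * payoff(path)
    return price / (1 + r) ** N


# One-step call with strike 150 (hypothetical sample numbers).
call_price = crr_price_by_enumeration(
    lambda path: max(path[-1] - 150.0, 0.0), 150.0, -0.4, 0.2, 0.0, 1
)
```

Because the discounted stock price is a martingale under the measure with $\tilde{p} = (r - a)/(b - a)$, pricing the "pay-off" $f_N = S_N$ with this function should return $S_0$, which is a convenient sanity check.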

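Remarkably, the option writer in this model, on taking the premium $C(f_N; P)$
from the buyer, can build a portfolio $\tilde{\pi} = (\tilde{\beta}, \tilde{\gamma})$ of value $X^{\tilde{\pi}} = (X_n^{\tilde{\pi}})_{n \le N}$ replicating
faithfully the pay-off $f_N$ at the instant $N$. As mentioned, e.g., in § 1b, one standard
way of finding this portfolio $\tilde{\pi} = (\tilde{\beta}, \tilde{\gamma})$ is as follows.

We consider the martingale $M = (M_n, \mathcal{F}_n, \widetilde{P}_N)_{n \le N}$, where

$$M_n = \widetilde{E}\Bigl(\frac{f_N}{B_N} \,\Big|\, \mathcal{F}_n\Bigr).$$

In view of the ‘$\frac{S}{B}$-representation’, there exists a predictable sequence $\tilde{\gamma} = (\tilde{\gamma}_i)_{i \le N}$
such that

$$M_n = M_0 + \sum_{k=1}^{n} \tilde{\gamma}_k\, \Delta\Bigl(\frac{S_k}{B_k}\Bigr), \qquad n \le N. \tag{8}$$

Setting $\tilde{\beta}_n = M_n - \tilde{\gamma}_n \dfrac{S_n}{B_n}$ we obtain (see Chapter V, § 4b and § 1b in the present
chapter) a self-financing hedge $\tilde{\pi} = (\tilde{\beta}, \tilde{\gamma})$ of value

$$X_n^{\tilde{\pi}} = \tilde{\beta}_n B_n + \tilde{\gamma}_n S_n = B_n\, \widetilde{E}\Bigl(\frac{f_N}{B_N} \,\Big|\, \mathcal{F}_n\Bigr)$$

such that

$$X_0^{\tilde{\pi}} = C(f_N; P) \tag{9}$$

and we have the property of perfect hedging:

$$X_N^{\tilde{\pi}} = f_N.$$

Since

$$\Delta\Bigl(\frac{S_n}{B_n}\Bigr) = \frac{S_{n-1}}{B_n}\,(\rho_n - r), \tag{10}$$

it follows by (8) that

$$M_n = M_0 + \sum_{k=1}^{n} \alpha_k^{(\rho)} (\rho_k - r) = M_0 + \sum_{k=1}^{n} \alpha_k^{(\rho)}\, \Delta m_k^{(\rho)}, \tag{11}$$

where $\alpha_k^{(\rho)}$ and $\tilde{\gamma}_k$ are connected by the equality

$$\tilde{\gamma}_k = \frac{\alpha_k^{(\rho)} B_k}{S_{k-1}} \tag{12}$$

and the sequence $m^{(\rho)} = (m_n^{(\rho)}, \mathcal{F}_n, \widetilde{P}_N)_{n \le N}$ of the variables

$$m_n^{(\rho)} = \sum_{k=1}^{n} (\rho_k - r) \tag{13}$$

is a martingale.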
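As will be clear from what follows, it makes sense to consider alongside the
sequence $\rho = (\rho_n)$ also the sequence $\delta = (\delta_n)$ of the variables

$$\delta_n = \frac{\rho_n - a}{b - a}. \tag{14}$$

Clearly,

$$\delta_n = \begin{cases} 1 & \text{if } \rho_n = b, \\ 0 & \text{if } \rho_n = a, \end{cases} \tag{15}$$

and $\mathcal{F}_n = \sigma(\rho_1, \dots, \rho_n) = \sigma(\delta_1, \dots, \delta_n)$.

Since

$$\delta_k - \tilde{p} = \frac{\rho_k - r}{b - a}, \tag{16}$$

it follows that, in addition to (8) and (11), we have also the representation

$$M_n = M_0 + \sum_{k=1}^{n} \alpha_k^{(\delta)}\, \Delta m_k^{(\delta)}, \tag{17}$$

where the sequence $m^{(\delta)} = (m_n^{(\delta)}, \mathcal{F}_n, \widetilde{P}_N)$ of variables

$$m_n^{(\delta)} = \sum_{k=1}^{n} (\delta_k - \tilde{p}) \tag{18}$$

is a martingale and

$$\alpha_k^{(\delta)} = (b - a)\, \alpha_k^{(\rho)}. \tag{19}$$

We sum up the above results as follows.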
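THEOREM 1. 1) In the framework of the CRR-model (1), for each $N$ and each
$\mathcal{F}_N$-measurable pay-off $f_N$ the fair price $C(f_N; P)$ can be described by the formula

$$C(f_N; P) = \widetilde{E}\, \frac{f_N}{(1 + r)^N}, \tag{20}$$

where $\widetilde{E}$ is averaging with respect to the martingale measure $\widetilde{P}_N$.
2) There exists a perfect self-financing hedge $\tilde{\pi} = (\tilde{\beta}, \tilde{\gamma})$ of value $X^{\tilde{\pi}} = (X_n^{\tilde{\pi}})_{n \le N}$
such that

$$X_0^{\tilde{\pi}} = C(f_N; P), \qquad X_N^{\tilde{\pi}} = f_N,$$

and

$$X_n^{\tilde{\pi}} = B_n\, \widetilde{E}\Bigl(\frac{f_N}{B_N} \,\Big|\, \mathcal{F}_n\Bigr). \tag{21}$$

3) The components $\tilde{\beta} = (\tilde{\beta}_n)_{n \le N}$ and $\tilde{\gamma} = (\tilde{\gamma}_n)_{n \le N}$ of the hedge $\tilde{\pi}$ satisfy the
relation

$$\tilde{\beta}_n = M_n - \tilde{\gamma}_n \frac{S_n}{B_n},$$

where $\tilde{\gamma}_n$, $n \le N$, can be determined from the ‘$\frac{S}{B}$-representation’ (8) for the
martingale $M = (M_n, \mathcal{F}_n, \widetilde{P}_N)_{n \le N}$ with

$$M_n = \widetilde{E}\Bigl(\frac{f_N}{B_N} \,\Big|\, \mathcal{F}_n\Bigr).$$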


3. As seen from the statement of this theorem, finding a perfect hedge
$\tilde{\pi} = (\tilde{\beta}, \tilde{\gamma})$ is intimately connected with the representation of the martingale
$M = (M_n, \mathcal{F}_n, \widetilde{P}_N)_{n \le N}$ in one of the equivalent forms (8), (11), or (17). The next
result concerns one interesting case of such a representation.


THEOREM 2. Assume that a pay-off function is as follows:

$$f_N = B_N\, g(\Delta_N), \tag{22}$$

where $g = g(\Delta_N)$ is a function of $\Delta_N = \delta_1 + \cdots + \delta_N$.
Then the coefficients $\alpha_k^{(\delta)}$ in the representation (17) are

$$\alpha_k^{(\delta)} = G_{N-k}(\Delta_{k-1}; \tilde{p}), \qquad 1 \le k \le N, \tag{23}$$

where $\Delta_0 = 0$ and

$$G_n(x; p) = \sum_{k=0}^{n} \bigl[g(x + k + 1) - g(x + k)\bigr]\, C_n^k\, p^k (1 - p)^{n-k}. \tag{24}$$
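Proof. First of all, we note that $M_n = \widetilde{E}(M_N \mid \mathcal{F}_n)$, where $M_N = \dfrac{f_N}{B_N}$.

Since $\Delta M_n = \alpha_n^{(\delta)} \Delta m_n^{(\delta)}$, the value of $\alpha_n^{(\delta)} = \alpha_n^{(\delta)}(\delta_1, \dots, \delta_{n-1})$ can be
expressed as follows:

$$\alpha_n^{(\delta)} = \frac{\widetilde{E}(M_N \mid \delta_1, \dots, \delta_{n-1}, 1) - \widetilde{E}(M_N \mid \delta_1, \dots, \delta_{n-1})}{1 - \tilde{p}}
= \frac{\widetilde{E}(g(\Delta_N) \mid \delta_1, \dots, \delta_{n-1}, 1) - \widetilde{E}(g(\Delta_N) \mid \delta_1, \dots, \delta_{n-1})}{1 - \tilde{p}}. \tag{25}$$

On the set $\{\omega : \Delta_{n-1} = x,\ \delta_n = 1\}$ we have

$$\widetilde{E}(g(\Delta_N) \mid \mathcal{F}_n) = \widetilde{E}\, g(x + 1 + \Delta_N - \Delta_n)$$

and

$$\widetilde{E}(g(\Delta_N) \mid \mathcal{F}_{n-1}) = \widetilde{E}\, g(x + \Delta_N - \Delta_{n-1})
= \tilde{p}\, \widetilde{E}\, g(x + 1 + \Delta_N - \Delta_n) + (1 - \tilde{p})\, \widetilde{E}\, g(x + \Delta_N - \Delta_n).$$

Hence we obtain there

$$\widetilde{E}(g(\Delta_N) \mid \mathcal{F}_n) - \widetilde{E}(g(\Delta_N) \mid \mathcal{F}_{n-1})
= (1 - \tilde{p})\, \widetilde{E}\bigl[g(x + 1 + \Delta_N - \Delta_n) - g(x + \Delta_N - \Delta_n)\bigr]$$
$$= (1 - \tilde{p}) \sum_{k=0}^{N-n} \bigl[g(x + 1 + k) - g(x + k)\bigr]\, C_{N-n}^k\, \tilde{p}^k (1 - \tilde{p})^{N-n-k}
= (1 - \tilde{p})\, G_{N-n}(x; \tilde{p}),$$

which, in view of (25), brings us to the required representation (23).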


§4c. Rational Pricing and Hedging Strategies. 
Markovian Pay-Off Functions 


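1. We shall now assume that the pay-off function $f_N$ has the ‘Markovian’ form
$f_N = f(S_N)$, where $f = f(x)$ is a nonnegative function of $x \ge 0$.
Let

$$X_n^{\tilde{\pi}} = \widetilde{E}\bigl[f(S_N)(1 + r)^{-(N-n)} \mid \mathcal{F}_n\bigr] \tag{1}$$

be the value of a perfect hedge $\tilde{\pi}$ at time $n$; in particular,

$$C(f_N; P) = X_0^{\tilde{\pi}} = \widetilde{E}\bigl[f(S_N)(1 + r)^{-N}\bigr]. \tag{2}$$

We set

$$F_n(x; p) = \sum_{k=0}^{n} f\bigl(x (1 + b)^k (1 + a)^{n-k}\bigr)\, C_n^k\, p^k (1 - p)^{n-k}. \tag{3}$$

We have

$$\prod_{n < k \le N} (1 + \rho_k) = (1 + b)^{\Delta_N - \Delta_n} (1 + a)^{(N - n) - (\Delta_N - \Delta_n)} \tag{4}$$

with $\Delta_n = \delta_1 + \cdots + \delta_n$ for $\delta_k = \dfrac{\rho_k - a}{b - a}$, and therefore

$$\widetilde{E} f\Bigl(x \prod_{n < k \le N} (1 + \rho_k)\Bigr) = F_{N-n}(x; \tilde{p}) \tag{5}$$

with $\tilde{p} = \dfrac{r - a}{b - a}$.
Taking finally into account the equality

$$S_N = S_n \prod_{n < k \le N} (1 + \rho_k), \tag{6}$$

we obtain by (5) the following result.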


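THEOREM 1. The value $X^{\tilde{\pi}} = (X_n^{\tilde{\pi}})_{n \le N}$ of a perfect hedge $\tilde{\pi}$ in the CRR-model
with Markovian pay-off function $f_N = f(S_N)$ can be described by the formulas

$$X_n^{\tilde{\pi}} = (1 + r)^{-(N-n)} F_{N-n}(S_n; \tilde{p}). \tag{7}$$

In particular, the rational option price is

$$C(f_N; P) = X_0^{\tilde{\pi}} = (1 + r)^{-N} F_N(S_0; \tilde{p}). \tag{8}$$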


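We set

$$g(x) = \frac{1}{B_N}\, f\Bigl(S_0 (1 + a)^N \Bigl(\frac{1 + b}{1 + a}\Bigr)^{x}\Bigr). \tag{9}$$

Then, by Theorem 2 in the preceding section, the coefficients $\alpha_k^{(\delta)}$ in the representation

$$M_N = M_0 + \sum_{k=1}^{N} \alpha_k^{(\delta)} (\delta_k - \tilde{p}) \tag{10}$$

for

$$M_N = \frac{f_N}{B_N} = \frac{f(S_N)}{B_N}$$

are predictable functions:

$$\alpha_i^{(\delta)} = G_{N-i}(\Delta_{i-1}; \tilde{p}),$$

where

$$G_{N-i}(x; \tilde{p}) = \frac{1}{B_N} \sum_{k=0}^{N-i} C_{N-i}^k\, \tilde{p}^k (1 - \tilde{p})^{N-i-k}
\Bigl[ f\Bigl(S_0 (1 + a)^N \Bigl(\frac{1 + b}{1 + a}\Bigr)^{x+k+1}\Bigr) - f\Bigl(S_0 (1 + a)^N \Bigl(\frac{1 + b}{1 + a}\Bigr)^{x+k}\Bigr) \Bigr]. \tag{11}$$

Setting here $x = \Delta_{i-1}$ and taking account of the equality

$$S_{i-1} = S_0 (1 + a)^{i-1} \Bigl(\frac{1 + b}{1 + a}\Bigr)^{\Delta_{i-1}}$$

and notation (3), we obtain by (11) that

$$G_{N-i}(\Delta_{i-1}; \tilde{p}) = \frac{1}{B_N} \bigl[F_{N-i}(S_{i-1}(1 + b); \tilde{p}) - F_{N-i}(S_{i-1}(1 + a); \tilde{p})\bigr]. \tag{12}$$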


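Now, note that, by formulas (12) and (16) in § 4b,

$$\tilde{\gamma}_i = \frac{\alpha_i^{(\rho)} B_i}{S_{i-1}} = \frac{G_{N-i}(\Delta_{i-1}; \tilde{p})\, B_i}{S_{i-1}(b - a)}. \tag{13}$$

By (12) and (13) we obtain

$$\tilde{\gamma}_i = (1 + r)^{-(N-i)}\, \frac{F_{N-i}(S_{i-1}(1 + b); \tilde{p}) - F_{N-i}(S_{i-1}(1 + a); \tilde{p})}{S_{i-1}(b - a)}. \tag{14}$$

As in § 4b, we set

$$\tilde{\beta}_i = M_i - \tilde{\gamma}_i \frac{S_i}{B_i}.$$

The strategy $\tilde{\pi} = (\tilde{\beta}, \tilde{\gamma})$ is self-financing, therefore

$$\Delta\tilde{\beta}_i\, B_{i-1} + \Delta\tilde{\gamma}_i\, S_{i-1} = 0.$$

Hence

$$X_{i-1}^{\tilde{\pi}} = \tilde{\beta}_{i-1} B_{i-1} + \tilde{\gamma}_{i-1} S_{i-1} = \tilde{\beta}_i B_{i-1} + \tilde{\gamma}_i S_{i-1},$$

so that

$$\tilde{\beta}_i = \frac{X_{i-1}^{\tilde{\pi}} - \tilde{\gamma}_i S_{i-1}}{B_{i-1}}.$$

Bearing in mind (7) and (14) we obtain

$$\tilde{\beta}_i = \frac{F_{N-i+1}(S_{i-1}; \tilde{p})}{B_N} - \frac{G_{N-i}(\Delta_{i-1}; \tilde{p})(1 + r)}{b - a}
= \frac{1}{B_N} \Bigl\{ F_{N-i+1}(S_{i-1}; \tilde{p}) - \frac{1 + r}{b - a} \bigl[F_{N-i}(S_{i-1}(1 + b); \tilde{p}) - F_{N-i}(S_{i-1}(1 + a); \tilde{p})\bigr] \Bigr\}. \tag{15}$$

We now sum up the results so obtained.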

We now sum up the results so obtained. 


THEOREM 2. The components $\tilde{\beta} = (\tilde{\beta}_i)_{i \le N}$ and $\tilde{\gamma} = (\tilde{\gamma}_i)_{i \le N}$ of a perfect hedge
$\tilde{\pi} = (\tilde{\beta}, \tilde{\gamma})$ in the CRR-model with $f_N = f(S_N)$ can be defined by formulas (14)
and (15).


COROLLARY 1. The predictable functions $\tilde{\beta}_i$ and $\tilde{\gamma}_i$ depend on the ‘past’ only
through the variable $S_{i-1}$:

$$\tilde{\beta}_i = \tilde{\beta}_i(S_{i-1}), \qquad \tilde{\gamma}_i = \tilde{\gamma}_i(S_{i-1}).$$


COROLLARY 2. Let $f = f(x)$ be a nondecreasing nonnegative function. Then it
follows from (3) and (14) that if $\tilde{\pi} = (\tilde{\beta}, \tilde{\gamma})$ is a perfect hedge, then $\tilde{\gamma}_i \ge 0$ for all
$i \le N$.


Remark. One can interpret negative $\tilde{\gamma}_i$ as borrowing stock (short-selling). Then
Corollary 2 means that if $f(x)$ is nondecreasing, then no short-selling is necessary
for perfect hedging.
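Formulas (3), (14), and (15) translate almost line by line into code. The sketch below is illustrative only: the names `F` and `hedge` and the default `B0 = 1.0` are assumptions introduced here, and the pay-off `f` is any function of the terminal price.

```python
from math import comb


def F(n, x, p, f, a, b):
    """F_n(x; p) from formula (3): binomial average of f over n steps."""
    return sum(
        f(x * (1 + b) ** k * (1 + a) ** (n - k)) * comb(n, k) * p ** k * (1 - p) ** (n - k)
        for k in range(n + 1)
    )


def hedge(i, s_prev, f, a, b, r, N, B0=1.0):
    """Return (beta_i, gamma_i) of the perfect hedge, given S_{i-1} = s_prev.

    gamma_i follows formula (14) and beta_i follows formula (15).
    """
    p = (r - a) / (b - a)  # martingale probability
    diff = F(N - i, s_prev * (1 + b), p, f, a, b) - F(N - i, s_prev * (1 + a), p, f, a, b)
    gamma = (1 + r) ** (-(N - i)) * diff / (s_prev * (b - a))
    B_N = B0 * (1 + r) ** N
    beta = (F(N - i + 1, s_prev, p, f, a, b) - (1 + r) / (b - a) * diff) / B_N
    return beta, gamma
```

For a nondecreasing `f` the difference `diff` is nonnegative, so `gamma` is nonnegative, in line with Corollary 2.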


§4d. Standard Call and Put Options 
1. For a standard call option,

$$f(S_N) = (S_N - K)^+,$$

where $N$ is the maturity time and $K$ is the strike price. Of course, formulas for the
rational price and perfect hedge obtained in the preceding section look simpler
in this case.
By definition (3) in § 4c,

$$F_N(S_0; \tilde{p}) = \sum_{k=0}^{N} C_N^k\, \tilde{p}^k (1 - \tilde{p})^{N-k} \max\Bigl\{0,\ S_0 (1 + a)^N \Bigl(\frac{1 + b}{1 + a}\Bigr)^{k} - K\Bigr\}. \tag{1}$$

Let

$$K_0 = K_0\Bigl(a, b, N; \frac{S_0}{K}\Bigr)$$

be the smallest integer such that

$$S_0 (1 + a)^N \Bigl(\frac{1 + b}{1 + a}\Bigr)^{K_0} > K. \tag{2}$$

For $f_N = (S_N - K)^+$ we shall denote for brevity $C(f_N; P)$ by $C_N$ (or by $C_N^{(K)}$
if we want to underline the dependence on $K$).

If $K_0 > N$, then $F_N(S_0; \tilde{p}) = 0$, and therefore the rational price $C_N$ is equal
to zero (see (8), § 4c); this is understandable since we surely have $S_N \le K$ in this
case, and the purchase of an option brings no profits.

We shall assume for that reason that $K_0 \le N$. Then

$$C_N = (1 + r)^{-N} F_N(S_0; \tilde{p})
= S_0 \sum_{k=K_0}^{N} C_N^k\, \tilde{p}^k (1 - \tilde{p})^{N-k} \Bigl(\frac{1 + a}{1 + r}\Bigr)^{N} \Bigl(\frac{1 + b}{1 + a}\Bigr)^{k}
- K (1 + r)^{-N} \sum_{k=K_0}^{N} C_N^k\, \tilde{p}^k (1 - \tilde{p})^{N-k}. \tag{3}$$

We set

$$p^* = \frac{1 + b}{1 + r}\, \tilde{p}, \tag{4}$$

$$\mathbb{B}(j, N; p) = \sum_{k=j}^{N} C_N^k\, p^k (1 - p)^{N-k}. \tag{5}$$

Using this notation, we can formulate the result so obtained (which is originally 
due to J. C. Cox, S. A. Ross, and M. Rubinstein [82]) as follows. 


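THEOREM. The fair (rational) price of a standard European call option with pay-off
$f(S_N) = (S_N - K)^+$ is

$$C_N = S_0\, \mathbb{B}(K_0, N; p^*) - K (1 + r)^{-N}\, \mathbb{B}(K_0, N; \tilde{p}), \tag{6}$$

where

$$K_0 = 1 + \Bigl[\ln \frac{K}{S_0 (1 + a)^N} \Big/ \ln \frac{1 + b}{1 + a}\Bigr] \tag{7}$$

and $[\,\cdot\,]$ denotes the integer part. If $K_0 > N$, then $C_N = 0$.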
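The closed-form price (6) is straightforward to evaluate numerically. The sketch below is a hedged illustration rather than the author's code: the names `binom_tail` and `crr_call` are introduced here, `floor` is used for the integer part in (7), and indices below zero are clipped so that deep in-the-money strikes are handled.

```python
from math import comb, floor, log


def binom_tail(j, N, p):
    """B(j, N; p) = sum_{k=j}^{N} C(N, k) p^k (1 - p)^(N - k), as in (5)."""
    return sum(comb(N, k) * p ** k * (1 - p) ** (N - k) for k in range(max(j, 0), N + 1))


def crr_call(S0, K, a, b, r, N):
    """Rational price C_N of a European call by formula (6).

    Assumes -1 < a < r < b, S0 > 0 and K > 0.
    """
    K0 = 1 + floor(log(K / (S0 * (1 + a) ** N)) / log((1 + b) / (1 + a)))
    if K0 > N:
        return 0.0  # the option can never finish in the money
    p = (r - a) / (b - a)            # martingale probability p~
    p_star = (1 + b) / (1 + r) * p   # p* from (4)
    return S0 * binom_tail(K0, N, p_star) - K * (1 + r) ** (-N) * binom_tail(K0, N, p)
```

If rounding in the logarithm places $K_0$ one step too low at a node where $S_N = K$ exactly, the extra term contributes zero to (6), so the price is unaffected.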
2. Since

$$(K - S_N)^+ = (S_N - K)^+ - S_N + K,$$

the rational (fair) price $P_N$ of a put option can be defined by the formula

$$P_N = \widetilde{E}(1 + r)^{-N} (K - S_N)^+ = C_N - \widetilde{E}(1 + r)^{-N} S_N + K (1 + r)^{-N}. \tag{8}$$

Here $\widetilde{E}(1 + r)^{-N} S_N = S_0$. Hence we have the following identity (the call-put parity):

$$P_N = C_N - S_0 + K (1 + r)^{-N}. \tag{9}$$
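The parity relation between put and call prices can be confirmed numerically by pricing both options as discounted martingale expectations. This is a small sketch under assumed sample parameters; `european_price` is a hypothetical helper name introduced for the illustration.

```python
from math import comb


def european_price(f, S0, a, b, r, N):
    """Discounted expectation of f(S_N) under the martingale measure."""
    p = (r - a) / (b - a)
    total = sum(
        comb(N, k) * p ** k * (1 - p) ** (N - k) * f(S0 * (1 + b) ** k * (1 + a) ** (N - k))
        for k in range(N + 1)
    )
    return total / (1 + r) ** N


# Hypothetical sample parameters.
S0, K, a, b, r, N = 100.0, 105.0, -0.1, 0.2, 0.05, 4
C = european_price(lambda s: max(s - K, 0.0), S0, a, b, r, N)
P = european_price(lambda s: max(K - s, 0.0), S0, a, b, r, N)

# Difference between the put price and the parity expression; should vanish.
parity_gap = P - (C - S0 + K * (1 + r) ** (-N))
```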


3. Let $f = f(S_N)$ be a pay-off function and let $C_N^{(f)} = B_0\, \widetilde{E}\, \dfrac{f(S_N)}{B_N}$ be the corresponding
rational (fair) price.
The following observation (see, e.g., [121], [122]) shows how one can use the
rational prices $C_N^{(K)}$ corresponding to the pay-off $(S_N - K)^+$, $K > 0$, in the search
of the values of $C_N^{(f)}$ for options with other types of pay-off functions $f$.
Assume that the derivative of the pay-off function $f = f(x)$, $x \ge 0$, can be
expressed as an integral: $f'(x) = \int_0^x \mu(dy)$, where $\mu = \mu(dy)$ is a finite measure
(not necessarily of constant sign) on $(\mathbb{R}_+, \mathcal{B}(\mathbb{R}_+))$. (If $f(y)$ has a second derivative
in the usual sense, then $\mu(dy) = f''(y)\, dy$.) Then it is straightforward that

$$f(x) = f(0) + x f'(0) + \int_0^\infty (x - K)^+ \mu(dK)$$

and therefore

$$f(S_N) = f(0) + S_N f'(0) + \int_0^\infty (S_N - K)^+ \mu(dK) \quad (P\text{-a.s.}).$$

We consider now the expectation with respect to the martingale measure $\widetilde{P}_N$
and obtain

$$\widetilde{E}\, \frac{f(S_N)}{B_N} = \frac{f(0)}{B_N} + \widetilde{E}\, \frac{S_N}{B_N}\, f'(0) + \int_0^\infty \widetilde{E}\, \frac{(S_N - K)^+}{B_N}\, \mu(dK),$$

so that, by formula (6) in § 1b,

$$C_N^{(f)} = (1 + r)^{-N} f(0) + S_0 f'(0) + \int_0^\infty C_N^{(K)}\, \mu(dK). \tag{10}$$

Note that if $f(x) = (x - K_*)^+$, $K_* > 0$, then $\mu(dK)$ is concentrated at the
point $K_*$ (i.e., $\mu(dK) = \delta_{\{K_*\}}(dK)$) and $C_N^{(f)} = C_N^{(K_*)}$, as one would expect.
4. Formulas (6) and (9) answer the question on the rational price of put and call
options. It is also of considerable practical interest to the option writer to know
how to find a perfect hedge $\tilde{\pi} = (\tilde{\beta}, \tilde{\gamma})$; this can be carried out on the basis of
formulas (15) and (14) in the preceding section. We do not analyze these formulas 
thoroughly here; we content ourselves with one simple example, the idea of which 
is borrowed from [162]. (See also [443] and a similar illustrative example at the 
beginning of this chapter.) 


EXAMPLE. Consider two currencies, A and B. Let $S_n$ be the price of 100 units of A
expressed in the units of B for $n = 0$ and 1. Let $S_0 = 150$ and assume that the
price $S_1$ at time $n = 1$ is expected to be either 180 (the currency A rises) or 90 (the
currency A falls).
We write

$$S_1 = S_0 (1 + \rho_1), \tag{11}$$

and see that $\rho_1$ can take two values, $b = \frac{1}{5}$ and $a = -\frac{2}{5}$, which correspond to a rise
or a drop in the cross rate of A.

Let Bo = 1 (in the units of B) and let r = 0. Thus, we assume (for simplicity) 
that funds put into a bank account bring no profits and no interest on loans is 
taken. 

Let $N = 1$ and let $f(S_1) = (S_1 - K)^+$, where $K = 150$ (units of B). Thus, if the
currency A rises, then a buyer of a call option obtains
$180 - 150 = 30$ (units of B), whereas if the exchange rate falls, then $f(S_1) = 0$.

So far, nothing has been said on the probabilities of the events $\rho_1 = b$ and
$\rho_1 = a$. Assuming that A can rise or fall with probability $\frac{1}{2}$ we obtain that $E f(S_1) =
30 \cdot \frac{1}{2} = 15$. A classical view, dating back to the times of J. Bernoulli and C. Huygens
(see, e.g., [186; pp. 397-402]), is that $E f(S_1) = 15$ (units of B) could be a reasonable
price of such an option.

It should be emphasized, however, that this quantity depends essentially on our
assumption on the values of the probabilities $p = P(\rho_1 = b)$ and $1 - p = P(\rho_1 = a)$.
If $p = \frac{1}{2}$, then, as we see, $E f(S_1) = 15$ (B). However, if $p \ne \frac{1}{2}$, then we obtain
another value of $E f(S_1)$.

Taking into account that, in real life, one usually has no conclusive evidence in 
favor of some or other values of p, one understands that the classical approach to 
the calculation of rational prices is far from satisfactory. 


The rational pricing theory expounded above works under the assumption that $p$,
$0 < p < 1$, is arbitrary. The value that must enter the (classical scheme of)
calculations is

$$\tilde{p} = \frac{r - a}{b - a}.$$

In our example

$$\tilde{p} = \frac{0 + \frac{2}{5}}{\frac{1}{5} + \frac{2}{5}} = \frac{2}{3}.$$

If $N = 1$, then the corresponding value $K_0 = K_0(a, b, 1; S_0/K)$ is 1 for $a = -\frac{2}{5}$,
$b = \frac{1}{5}$, and $S_0 = K = 150$, therefore

$$C_1 = S_0\, \tilde{p}\, (1 + b) - K \tilde{p} = S_0\, \tilde{p}\, b = 150 \cdot \frac{2}{3} \cdot \frac{1}{5} = 20$$

by (3).

Hence the buyer of an option must pay a premium Cy of 20 units of B, which 
can now be regarded as the starting capital Xo = 20 (B) of the writer (issuer), who 
invests it on the market. 

We represent now $X_0$ in the standard form (for $(B, S)$-markets) $X_0 = \beta_0 B_0 + \gamma_0 S_0$.
Setting $B_0 = 1$ and $S_0 = 150$ we can express the capital $X_0 = 20$ (B) as follows:
$20 = 0 + \frac{2}{15} \cdot 150$. That is, $\beta_0 = 0$ and $\gamma_0 = \frac{2}{15}$, which can be deciphered as follows:
the issuer puts 0 units of B into the bank account, while $\gamma_0 \cdot S_0 = \frac{2}{15} \cdot 150 = 20$
units of B can be converted into the currency A.

Assume that the issuer can also borrow money (from the bank account B, in
the currency B), which, of course, should be paid back in the future. Then we can
represent the initial capital $X_0 = 20$ (B) as the sum $X_0 = -30 + \frac{1}{3} \cdot 150$, which
corresponds to the portfolio $(\beta_0, \gamma_0) = (-30, \frac{1}{3})$, meaning that the issuer borrows
30 units of B and can now exchange $\frac{1}{3} \cdot 150 = 50$ units of B for 33.33 units of A.

Assume that, as an investor on our $(B, S)$-market, the issuer chooses $(\beta_1, \gamma_1) =
(\beta_0, \gamma_0)$. What does his portfolio bring at the instant $N = 1$?

By the assumption that $B_1 = B_0 = 1$, there will be $\beta_1 B_1 = -30$ units of B in
the bank account.

If the currency A rises (180 B = 100A), then 33.33 units of A will be worth 60 
units of B, of which 30 is the outstanding debt. On paying it back the issuer still 
has 60 — 30 = 30 units of B, which he will pay to the buyer of the option to meet 
the conditions of the contract. 

On the other hand, if A falls, then 33.33 units of A will be worth 30 units of B, 
which should be paid back to the bank. Nothing must be paid to the buyer (who 
has lost), so that the issuer ‘comes clean’. 

Our choice of the portfolio $(\beta_1, \gamma_1) = (-30, \frac{1}{3})$ may appear ad hoc. However,
these are just the values suggested by the above theory.

In fact, by formula (14) in § 4c the ‘optimal’ value $\gamma_1 = \gamma_1(S_0)$ that is a component
of a perfect hedge can be calculated as follows:

$$\gamma_1(S_0) = \frac{F_0(S_0(1 + b); \tilde{p}) - F_0(S_0(1 + a); \tilde{p})}{S_0 (b - a)}
= \frac{f(S_0(1 + b)) - f(S_0(1 + a))}{S_0 (b - a)}
= \frac{(S_0(1 + b) - K)^+}{S_0 (b - a)} = \frac{b}{b - a} = \frac{1/5}{1/5 + 2/5} = \frac{1}{3}.$$

The value $\beta_1 = \beta_0$ can be defined from the condition

$$X_0 = \beta_0 + \gamma_0 S_0.$$

Since $X_0 = 20$, $\gamma_0 = \frac{1}{3}$, and $S_0 = 150$, it follows that $\beta_1 = \beta_0 = -30$, just as chosen
above.
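The arithmetic of this example can be re-traced with exact fractions. The snippet below is only a check of the numbers quoted in the text; the variable names are introduced here for the illustration.

```python
from fractions import Fraction

S0, K = Fraction(150), Fraction(150)
a, b, r = Fraction(-2, 5), Fraction(1, 5), Fraction(0)


def payoff(s):
    """Call pay-off (s - K)^+."""
    return max(s - K, Fraction(0))


p_tilde = (r - a) / (b - a)             # martingale probability of a rise
S_up, S_down = S0 * (1 + b), S0 * (1 + a)

# Rational price: discounted martingale expectation of the pay-off.
C1 = (p_tilde * payoff(S_up) + (1 - p_tilde) * payoff(S_down)) / (1 + r)

# Hedge: gamma from the one-step version of formula (14) in § 4c,
# beta from X_0 = beta * B_0 + gamma * S_0 with B_0 = 1.
gamma = (payoff(S_up) - payoff(S_down)) / (S0 * (b - a))
beta = C1 - gamma * S0

# Portfolio value at N = 1 in both scenarios (B_1 = 1 + r).
value_up = beta * (1 + r) + gamma * S_up
value_down = beta * (1 + r) + gamma * S_down
```

In both scenarios the portfolio value coincides with the pay-off, so the writer neither gains nor loses.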

It is clear from these arguments that the buyer’s net profit $V(S_1)$ (as a function
of $S_1$ for fixed $K$) can be described by the formula

$$V(S_1) = (S_1 - K)^+ - C_1.$$

The graph of this function is as follows:

[Figure: graph of $V(S_1)$ against $S_1$, with the point $K$ and the level $-C_1$ marked on the axes.]


Of course, the question on the writer’s profits also seems appropriate in this 
context. 

It is easy to see that there are none in the above example, whether the
currency A rises or falls. So, how can there be someone ready to float options
and other kinds of derivatives on financial markets?

In fact, the situation is more complex because, first of all, one must take into 
account overheads, the broker’s commission, taxes, and the like, which, of course, 
increases the size of the premium calculated above. For instance, the commission 
can be regarded as the writer’s profit. Moreover, one must bear in mind that the 
writer has control (not necessarily for long) over the collected premiums and can 
use these funds for gaining some money for himself. 

Some may also wonder at the variety of kinds of options and other derivatives 
traded in the market. 

One possible explanation is that there are always some who expect currencies, 
stock prices, and the like to rise or fall. Hence there must exist someone who derives 
profit from this. This is what issuers are doing by floating call options (designed for 
‘bulls’), put options (designed for ‘bears’), or their combinations with derivatives 
of other types. 
§4e. Option-Based Strategies
(Combinations and Spreads)


1. In practice one can encounter most diversified kinds of options and their com-
binations. We listed several kinds of options (‘with aftereffect’, ‘Asian’, etc.) in 
Chapter I, §1c. Some of them, due to their peculiarity and intricacy, are called 
‘exotic options’ (see, e.g., [414]). 


We now list (and characterize) several popular strategies based on different
kinds of options. Usually, one classifies these strategies into combinations and
spreads. The distinction is that combinations are made up from options of different
kinds, while spreads include options of one kind. (See, e.g., [50] for a more thorough
description, details of corresponding calculations, and a list of books touching upon
financial engineering, which considerably relies upon option-based strategies.)


2. Combinations 


Straddle is a combination of call and put options for the same stock with the
same strike price $K$ and of the same maturity $N$. The buyer’s gain-and-loss function
$V(S_N)$ ($= f(S_N) - C_N$) for such a combination is as follows:

$$V(S_N) = |S_N - K| - C_N.$$

Its graph is as in the chart below.

[Figure: graph of $V(S_N)$ against $S_N$ for a straddle.]


Strangle is a combination of call and put options of the same maturity $N$, but
of different strike prices $K_1$ and $K_2$. The typical graph of the buyer’s gain-and-loss
function $V(S_N)$ is as follows:




[Figure: graph of $V(S_N)$ against $S_N$ for a strangle.]

Analytically, $V(S_N)$ has the following expression:

$$V(S_N) = |S_N - K_2|\, I(S_N > K_2) + |S_N - K_1|\, I(S_N < K_1) - C_N.$$


Strap is a combination of one put option and two call options of the same
maturity $N$, but, in general, of different strike prices $K_1$ and $K_2$. If $K_1 = K_2 = K$,
then

$$V(S_N) = 2|S_N - K|\, I(S_N > K) + |S_N - K|\, I(S_N < K) - C_N.$$

The graph depicting the behavior of $V(S_N)$ is now asymmetric:

[Figure: graph of $V(S_N)$ against $S_N$ for a strap.]


Strip is a combination of one call option and two put options of the same
maturity $N$, but, in general, of different strike prices $K_1$ and $K_2$. The gain-and-loss
function is

$$V(S_N) = |S_N - K_2|\, I(S_N > K_2) + 2|S_N - K_1|\, I(S_N < K_1) - C_N,$$

and its graph has the following form:

[Figure: graph of $V(S_N)$ against $S_N$ for a strip.]


3. Spreads 


Bull spread is the strategy of buying a call option with strike price $K_1$ and
selling a call option with (higher) strike price $K_2 > K_1$. Here

$$V(S_N) = |K_2 - K_1|\, I(S_N > K_2) + |S_N - K_1|\, I(K_1 < S_N \le K_2) - C_N,$$

and the graph of this function is as follows:

[Figure: graph of $V(S_N)$ against $S_N$ for a bull spread, with the points $K_1$, $K_2$ and the level $-C_N$ marked.]


It is reasonable to buy a bull spread when an investor anticipates a rise in 
prices (of some stock, say), but wants to reduce potential losses. However, this 
combination restricts also the potential gains. 


Bear spread is the strategy of selling a call option with strike price $K_1$ and
buying a call option with strike price $K_2 > K_1$. For this combination

$$V(S_N) = -|K_2 - K_1|\, I(S_N > K_2) - |S_N - K_1|\, I(K_1 < S_N \le K_2) + C_N.$$

The corresponding graph is as follows:

[Figure: graph of $V(S_N)$ against $S_N$ for a bear spread, with the points $K_1$, $K_2$ and the level $C_N$ marked.]


Such a combination makes sense if the investor anticipates a drop in prices, but 
wants to cap losses due to possible rises of stock. 
As regards other kinds of spreads, see [50], § 24. 


4. On securities markets one comes across other combinations besides the above- 
mentioned ones, involving standard (call and put) options. For instance, there 
exists a strategy of buying options (a derivative security) and stock (the underlying 
security) at the same time. Investors choose such strategies in an attempt to insure
against a drop in stock prices below a certain level. If this occurs, then the investor
who has bought a put option can sell his stock at the (higher) strike price, rather 
than at the (lower) spot price. (See [50], § 22.) 
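The pay-offs of the combinations and spreads listed in this section are sums and differences of standard call and put pay-offs. The sketch below writes them before the net premium $C_N$ is subtracted (or, for the bear spread, added); all function names are illustrative, and the single-strike form of the strap is an assumption made for brevity.

```python
def call(s, k):
    """Standard call pay-off (s - k)^+."""
    return max(s - k, 0.0)


def put(s, k):
    """Standard put pay-off (k - s)^+."""
    return max(k - s, 0.0)


def straddle(s, k):
    return call(s, k) + put(s, k)            # equals |s - k|


def strangle(s, k1, k2):
    return put(s, k1) + call(s, k2)          # put at k1, call at k2


def strap(s, k):
    return 2 * call(s, k) + put(s, k)        # two calls, one put


def strip(s, k1, k2):
    return call(s, k2) + 2 * put(s, k1)      # one call, two puts


def bull_spread(s, k1, k2):
    return call(s, k1) - call(s, k2)         # long call k1, short call k2 > k1


def bear_spread(s, k1, k2):
    return call(s, k2) - call(s, k1)         # short call k1, long call k2 > k1
```

For example, `bull_spread(s, k1, k2)` is capped at `k2 - k1`, which reflects the restricted potential gains mentioned above.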
§5a. American Option Pricing


1. For American options the main pricing questions (both in the discrete and the 
continuous-time cases) take the following form: 


(i) what is the rational (fair, mutually acceptable) price of option contracts
with a fixed collection of pay-off functions?
(ii) what is the rational time for exercising an option? 
(iii) what optimal hedging strategy of an option writer ensures his ability to 
meet the conditions of the contract? 


In the present section, concerned with American option pricing in the discrete- 
time case, we pay most attention to the first two questions, (i) and (ii). In principle, 
questions of type (iii), on particular hedging strategies, are answered by Theorems 2 
and 3 in § 2c. 


2. We shall stick to the CRR-model of a $(B, S)$-market described in § 4b. That is,
we assume that $\Delta B_n = r B_{n-1}$ and $\Delta S_n = \rho_n S_{n-1}$; here $\rho = (\rho_n)$ is a sequence of
independent identically distributed random variables such that $P(\rho_n = b) = p$ and
$P(\rho_n = a) = q$, where $-1 < a < r < b$, $p + q = 1$, $0 < p < 1$.
An additional assumption enabling us to simplify considerably the analysis that
follows is that

$$b = \lambda - 1 \quad \text{and} \quad a = \lambda^{-1} - 1 \tag{1}$$

for some $\lambda > 1$.

Thus, in place of two parameters, $a$ and $b$, determining the evolution of the
prices $S_n$, $n \ge 1$, we must fix a single parameter $\lambda > 1$, which defines $a$ and $b$ by
formulas (1).

Clearly, in this case we have

$$S_n = S_0\, \lambda^{\varepsilon_1 + \cdots + \varepsilon_n}, \tag{2}$$

where $P(\varepsilon_i = 1) = P(\rho_i = b) = p$ and $P(\varepsilon_i = -1) = P(\rho_i = a) = q$ (cf. Chapter II, § 1e).

Assuming also that $S_0$ belongs to the set $E = \{\lambda^k : k = 0, \pm 1, \dots\}$, we see that
for each $n \ge 1$ the state $S_n$ also belongs to $E$.

The sequence $S = (S_n)_{n \ge 0}$ described by relation (2) with $S_0 \in E$ is usually
called a geometric random walk over the set of states $E = \{\lambda^k : k = 0, \pm 1, \dots\}$
(cf. Chapter II, § 1e).

Let $x \in E$ and let $P_x = \mathrm{Law}((S_n)_{n \ge 0} \mid P,\ S_0 = x)$ be the probability distribution
of the sequence $(S_n)_{n \ge 0}$ with respect to $P$ under the assumption that $S_0 = x$.

In accordance with the standard nomenclature of the theory of stochastic processes,
we can say that the sequence $S = (S_n)_{n \ge 0}$ with family of probabilities $P_x$,
$x \in E$, makes up a homogeneous Markov random walk, or a homogeneous Markov
process (with discrete times).

Let $T$ be the one-step transition operator, i.e., for a function $g = g(x)$ on $E$ we
set

$$T g(x) = E_x\, g(S_1), \qquad x \in E, \tag{3}$$

where $E_x$ is averaging with respect to the measure $P_x$.
In our case (2) we have

$$T g(x) = p\, g(\lambda x) + (1 - p)\, g\Bigl(\frac{x}{\lambda}\Bigr). \tag{4}$$


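3. The $(B, S)$-market described by the CRR-model is both arbitrage-free and complete,
and the unique martingale measure $\widetilde{P}$ has the following properties:

$$\widetilde{P}(\varepsilon_i = 1) = \widetilde{P}(\rho_i = b) = \frac{r - a}{b - a}, \qquad
\widetilde{P}(\varepsilon_i = -1) = \widetilde{P}(\rho_i = a) = \frac{b - r}{b - a}.$$

(See, e.g., Chapter V, § 1d.)

The ‘arbitrage-free martingale’ ideology of Chapter V requires that all probabilistic
calculations proceed with respect to the martingale measure $\widetilde{P}$, rather than
the original measure $P$. To avoid additional notation we shall assume that $P = \widetilde{P}$
from the very beginning, so that

$$p = \frac{r - a}{b - a} \quad \text{and} \quad q = \frac{b - r}{b - a}. \tag{5}$$

Bearing in mind (1) we see that

$$p = \frac{\alpha^{-1} - \lambda^{-1}}{\lambda - \lambda^{-1}}, \qquad q = \frac{\lambda - \alpha^{-1}}{\lambda - \lambda^{-1}}, \tag{6}$$

where $\alpha = (1 + r)^{-1}$.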
Let $f = (f_0, f_1, \dots)$ be a system of pay-off functions defined, as usual, on a
filtered probability space $(\Omega, \mathcal{F}, (\mathcal{F}_n)_{n \ge 0}, P)$ with $\mathcal{F}_0 = \{\varnothing, \Omega\}$.

In accordance with § 2a, let $\mathfrak{M}_n^N$ be the class of stopping times $\tau$ such that
$n \le \tau \le N$. Let $\mathfrak{M}_n^\infty$ be the class of finite stopping times such that $\tau \ge n$.

The buyer of an American option chooses himself the instant $\tau$ of exercising it
and obtains the amount of $f_\tau$. If the contract is made at time $n = 0$ and its expiry
date is $n = N$, then the buyer of an American option can choose any $\tau$ in the
class $\mathfrak{M}_0^N$ as the time of exercising. Of course, the writer must allow for the most
unfavorable buyer’s choice of $\tau$ and (in incomplete markets) the ‘Nature’s choice’
of a martingale measure among the possible ones. Thus, in accordance with § 1a,
the writer of an American option must opt for a strategy bringing about American
hedging.

Our $(B, S)$-market is complete, and the upper price of American hedging (see (5)
in § 2c)

$$\overline{C}_N(f; P) = \inf\{y : \exists\, \pi \text{ with } X_0^\pi = y,\ X_\tau^\pi \ge f_\tau \ (P\text{-a.s.}),\ \forall\, \tau \in \mathfrak{M}_0^N\}, \tag{7}$$

which can be reasonably taken as the price of the American option in question, can
be calculated by the formula

$$\overline{C}_N(f; P) = \sup_{\tau \in \mathfrak{M}_0^N} B_0\, E\, \frac{f_\tau}{B_\tau} \tag{8}$$

(see (19) in § 2c).
Recall that $E$ here is averaging with respect to the (martingale) measure $P$.


4. In the above discussion we put the accent on the questions of how and under what
conditions the seller of an option can meet the terms of the contract.

According to the general theory of American hedges (subsection 2), the premium
$\overline{C}_N(f; P)$ for the option contract defined by (8) is the smallest price at which the
writer can meet these terms.

The buyer is aware of this fact, and the price $\overline{C}_N(f; P)$ is in this sense acceptable
for both parties. Now, in accordance with the general theory, the writer can select
a hedging portfolio $\tilde{\pi}$ such that its value $X_\tau^{\tilde{\pi}}$ is at least $f_\tau$ for each $\tau \in \mathfrak{M}_0^N$.

We now discuss the question how the buyer, who agrees to pay the premium
$\overline{C}_N(f; P)$ for the contract, can choose the time of exercising it in the most rational
way.

Clearly, if the option is presented for exercise at a time $\sigma$ when $X_\sigma^{\tilde{\pi}} > f_\sigma$, then the writer obtains the
net profit $X_\sigma^{\tilde{\pi}} - f_\sigma$ after paying $f_\sigma$ to the buyer. Hence the buyer should choose
an instant $\sigma$ such that $X_\sigma^{\tilde{\pi}} = f_\sigma$. Such an instant actually exists and, as follows from
Theorem 4 in § 2c, it is the instant $\tau_0^N$ obtained in the course of the solution of the
optimal stopping problem of finding the upper bound

$$\sup_{\tau \in \mathfrak{M}_0^N} B_0\, E\, \frac{f_\tau}{B_\tau}. \tag{9}$$
5. We see from (8) that finding the price $\overline{C}_N(f; P)$ reduces to the solution of the
optimal stopping problem for the stochastic sequence $f_0, f_1, \dots, f_N$.

In §§ 5b, c, following mainly [443], we shall consider the standard call and put
options with pay-off functions $f_n = (S_n - K)^+$ and $f_n = (K - S_n)^+$ (or slightly
more general functions $f_n = \beta^n (S_n - K)^+$ and $f_n = \beta^n (K - S_n)^+$), respectively.
Together with the Markov property of the sequence $S = (S_n)_{n \ge 0}$, this special form
of the functions $f_n$ enables us to solve the optimal stopping problems in question
using the ‘Markovian version’ of the theory of optimal stopping rules described
in § 2a.5.


§5b. Standard Call Option Pricing 


1. We consider a standard call option with pay-off function that has the following 
form at time n: 
fn(z) = B" (2 - K)t, ré FB, (1) 


where 0 < 8 < 1, E = {r = AF: k =0,+1,...}, andA>1. 
For0 <n <N we set 


VN (z)= sup Ezs(aß) (S, — K)t, (2) 
TEMN 


where Sy, = SAH +En+k and Sn = x. 
It is worth noting that 


VN (2) = (a8)"V (2) (3) 


and, in accordance with relation (8) in §5a (and under the assumption Sp = 2), 
the price in question is 


Cn (f; P) = V (2). (4) 
By Theorem 3 in §2a and our remark upon it, 
Vo" (a) = Q™g(2), (5) 
where g(x) = (x — K)+ and 
Qg(z) = max(g(x),a8T9(x)). (6) 


The optimal time Ta exists in the class mY and can be found by the formula 


TÈ =min{0 <n < N: VT (Sn) = g(Sn)}- (7) 
Setting 
N = {2 € E: V” (z) = g(x)} 
= {x € E: VA (z) = (a8)"9(2)}, (8) 
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we see that 
re’ = min{0 <n < N: Sn E€ DN}. (9) 
Thus, given a sequence of stopping domains 
DË CDN C-.-CDN=E (10) 


and a sequence of continuation domains 
CDON oa CNS g (11) 


with cN = E\ DN: we can formulate the following rule for the option buyer 
concerning the time of exercising the contract. 

If So € DY, then 7A = 0; i.e., the buyer must agree at once to the pay-off 
(So - K)t. 

On the other hand, if So € EN =E\ DY (which is a typical situation), then 
the buyer must wait for the next value, Sı and take the decision of whether oh =] 
or TN > 1 depending on whether Sj € DN or S1 € CN, and so on. 

In our case of a standard call option it is easy to describe, in qualitative terms,
the geometry of the sets $D_n^N$ and $C_n^N$, $0 \le n \le N$, and therefore also the buyer’s
strategy of choosing the exercise time.


2. We see from (4)-(7) that finding the function $V_0^N(x)$ and the instant $\tau_0^N$ reduces
to finding recursively the functions $V_0^n(x) = Q^n g(x)$ for $n = 1, 2, \dots, N$.
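
The recursion $V_0^n = Q^n g$ can be sketched numerically as follows. This is only an illustration under stated assumptions: the function names `call_value_table` and `stopping_threshold` and the margin parameter `kmax` are ours, the lattice point $\lambda^k$ is indexed by the integer $k$, and $p = (r - a)/(b - a)$ with $b = \lambda - 1$, $a = \lambda^{-1} - 1$ is the martingale probability of an up-move.

```python
def call_value_table(lam, r, beta, N, K=1.0, kmax=5):
    """Return (v, g) with v[k] = Q^N g(lam**k) for |k| <= kmax and g[k] = (lam**k - K)^+."""
    alpha = 1.0 / (1.0 + r)                          # one-step bank discount
    p = (1.0 + r - 1.0 / lam) / (lam - 1.0 / lam)    # martingale probability of an up-move
    width = kmax + N                                 # margin consumed by the N recursion steps
    g = {k: max(lam**k - K, 0.0) for k in range(-width, width + 1)}
    v = dict(g)                                      # Q^0 g = g
    for n in range(1, N + 1):
        reach = width - n                            # the lattice shrinks by one node per step
        v = {k: max(g[k], alpha * beta * (p * v[k + 1] + (1 - p) * v[k - 1]))
             for k in range(-reach, reach + 1)}
    return v, g


def stopping_threshold(v, g):
    """Smallest exponent k with g > 0 at which immediate stopping is optimal, else None."""
    stops = [k for k in sorted(v) if g[k] > 0 and v[k] == g[k]]
    return stops[0] if stops else None
```

Because the lattice is allowed to shrink by one node at each step, no boundary condition has to be invented; the price is that `kmax + N` extra nodes are carried on each side.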

By assumption, $0 < \beta \le 1$. We claim that the case of $\beta = 1$ is elementary.

In fact, the sequence $(\alpha^n S_n)_{n \ge 0}$ is a martingale with respect to each measure
$P_x$, $x \in E$; therefore $(\alpha^n (S_n - K))_{n \ge 0}$ is a submartingale and, by Jensen’s
inequality for the convex function $x \mapsto x^+$, the sequence $(\alpha^n (S_n - K)^+)_{n \ge 0}$ is also a
submartingale.

Hence for each Markov time $\tau$, $0 \le \tau \le N$, we have

$$E_x \alpha^\tau (S_\tau - K)^+ \le E_x \alpha^N (S_N - K)^+ \tag{12}$$

by Doob’s stopping theorem (Chapter V, § 3a). This immediately shows that we
can take $\tau_0^N = N$ as an optimal stopping time in the problem

$$\sup_{\tau \in \mathfrak{M}_0^N} E_x \alpha^\tau (S_\tau - K)^+,$$

and therefore if $S_0 = x$, then

$$C_N(f; P) = V_0^N(x) = E_x \alpha^N (S_N - K)^+. \tag{13}$$


Translating this into more practical terms we obtain the following result of 
R. Merton [346]: 


if the discount factor $\beta$ is equal to 1, then the standard American and
European call options are ‘the same’.
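
Merton’s observation can be checked numerically. The sketch below is an illustration only: the helper names `martingale_p`, `european_call` and `american_call` are ours, the European value is the right-hand side of (13) computed from the binomial law of $\varepsilon_1 + \dots + \varepsilon_N$, and the American value is $Q^N g(x)$ computed by the recursion (5)-(6).

```python
from math import comb


def martingale_p(lam, r):
    # Up-move probability p = (r - a)/(b - a) with b = lam - 1 and a = 1/lam - 1.
    return (1.0 + r - 1.0 / lam) / (lam - 1.0 / lam)


def european_call(lam, r, N, x=1.0, K=1.0):
    """alpha^N * E_x (S_N - K)^+ for S_N = x * lam**(eps_1 + ... + eps_N)."""
    p = martingale_p(lam, r)
    alpha = 1.0 / (1.0 + r)
    total = 0.0
    for ups in range(N + 1):
        s_N = x * lam ** (2 * ups - N)            # 'ups' up-moves and N - ups down-moves
        total += comb(N, ups) * p**ups * (1 - p) ** (N - ups) * max(s_N - K, 0.0)
    return alpha**N * total


def american_call(lam, r, beta, N, x=1.0, K=1.0):
    """Q^N g(x) for g(y) = (y - K)^+, using Qf = max(g, alpha * beta * Tf)."""
    p = martingale_p(lam, r)
    alpha = 1.0 / (1.0 + r)
    g = lambda k: max(x * lam**k - K, 0.0)
    v = [g(k) for k in range(-N, N + 1)]          # Q^0 g on the exponents -N..N
    for n in range(1, N + 1):
        m = N - n                                 # after n steps the exponents -m..m remain
        v = [max(g(k), alpha * beta * (p * v[i + 2] + (1 - p) * v[i]))
             for i, k in enumerate(range(-m, m + 1))]
    return v[0]
```

With `beta=1.0` the two functions should agree up to rounding for any horizon `N`, whereas for `beta < 1` the American value is in general strictly larger than the discounted European one.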


In addition, the value of $V_0^N(x)$ can be found by formula (6) in § 4d.

3. We consider now a more interesting case of $0 < \beta < 1$.

THEOREM 1. For each $N \ge 0$ there exists a sequence of numbers $x_n^N \in E \cup \{0\}$,
$0 \le n \le N$, such that

$$D_n^N = \{x \in E : x \in [x_n^N, \infty)\},$$

$$C_n^N = \{x \in E : x \in (0, x_n^N)\}$$

and

$$\tau_0^N = \min\{0 \le n \le N : S_n \in D_n^N\} = \min\{0 \le n \le N : S_n \in [x_n^N, \infty)\}.$$

Moreover,

$$0 = x_N^N \le x_{N-1}^N \le \dots \le x_0^N \tag{14}$$

and

$$V_0^N(x) = \begin{cases} g(x), & x \in D_0^N = [x_0^N, \infty), \\ Q^N g(x), & x \in C_0^N = (0, x_0^N). \end{cases} \tag{15}$$

The rational price $C_N(f; P)$ is equal to $V_0^N(S_0)$.
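Proof. We set for simplicity $K = 1$, set consecutively $N = 1, 2, \dots$, and analyze
$Q^n g(x)$ for $n \le N$.

Let $N = 1$ and let $x = 1$ be the initial point, i.e., $x = \lambda^0$. By formula (4) in § 5a,
we see for the function $g(x) = (x - 1)^+$ (bearing in mind the inequality $\lambda > 1$) that

$$Tg(1) = p\,g(\lambda) + (1 - p)\,g(\lambda^{-1}) = p(\lambda - 1) > 0,$$
$$Qg(1) = \max\bigl(g(1), \alpha\beta\, Tg(1)\bigr) = \max\bigl(0, \alpha\beta p(\lambda - 1)\bigr) = \alpha\beta p(\lambda - 1) > 0.$$

Hence the use of the operator $Q$ ‘raises’ the value of $g = g(x)$ at the point $x = 1$
to $Qg(1) = \alpha\beta p(\lambda - 1)$.

In a similar way,

$$Tg(\lambda) = p(\lambda^2 - 1) \quad\text{and}\quad Qg(\lambda) = (\lambda - 1)\max\Bigl(1,\ \beta\,\frac{\lambda - \alpha}{\lambda - 1}\Bigr).$$

Hence if $\beta \le \dfrac{\lambda - 1}{\lambda - \alpha}$, then $Q$ does not change the value of $g(\lambda) = \lambda - 1$. However,
if $\beta > \dfrac{\lambda - 1}{\lambda - \alpha}$, then $Q$ ‘raises’ the value of $g(\lambda)$ to $\beta(\lambda - \alpha)$ $(> \lambda - 1)$.

Now let $x = \lambda^k$, $k > 1$. Then

$$Tg(\lambda^k) = \lambda^k \alpha^{-1} - 1 \quad\text{and}\quad Qg(\lambda^k) = \max\bigl(\lambda^k - 1,\ \beta(\lambda^k - \alpha)\bigr).$$

We note that

$$Qg(\lambda^k) = g(\lambda^k) \iff \lambda^k - 1 \ge \beta(\lambda^k - \alpha) \iff \lambda^k(1 - \beta) \ge 1 - \alpha\beta. \tag{16}$$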


Since

$$\lambda^k(1 - \beta) \ge 1 - \alpha\beta \implies \lambda^{k+1}(1 - \beta) \ge 1 - \alpha\beta,$$

it follows that

$$Qg(\lambda^k) = g(\lambda^k) \implies Qg(\lambda^{k+1}) = g(\lambda^{k+1}),$$

which can be interpreted as follows: if $\lambda^k \in D_0^1$, then the points $\lambda^{k+1}, \lambda^{k+2}, \dots$ also
belong to $D_0^1$.

By (16) we obtain that for sufficiently large $k$ the quantity $\lambda^k$ belongs to $D_0^1$.

Let $x_0^1 = \min\{x \in E : Qg(x) = g(x)\}$. Then it follows from the above that
$[x_0^1, \infty) \subseteq D_0^1$. Moreover, we can say that $[x_0^1, \infty) = D_0^1$.

Indeed, let $x = \lambda^k$ with $k \le -1$. Then $Tg(x) = 0$, $Qg(x) = 0$, so that both
instantaneous stopping at these points and continuation of observations (by one
step) bring one no gains. For that reason we can ascribe the points $x = \lambda^k$ with
$k \le -1$ to the continuation domain $C_0^1$. Of course, this domain also contains
$x = \lambda^0 = 1$ and the points $x = \lambda^k$ with $k \ge 1$ such that $\lambda^k < x_0^1$.

Thus, if $N = 1$, then

$$\tau_0^1 = \begin{cases} 0 & \text{for } S_0 \in [x_0^1, \infty), \\ 1 & \text{for } S_0 \in (0, x_0^1). \end{cases}$$


The analysis for $N = 2, 3, \dots$ proceeds in a similar way; the only difference is that
while we considered the action of $Q$ on $g(x)$ on the first step we shall now study its
action on the functions $Qg(x), Q^2 g(x), \dots$, each of them downwards convex (as is
$g = g(x)$; see Fig. 58 below) and coinciding with $g(x)$ for large $x$. These properties
mean that for each $N$ there exists $x_0^N$ such that $D_0^N = [x_0^N, \infty)$.

Not plunging any deeper into the detail of this simple analysis we note also
that it is immediately clear for $N = 2$ that $D_1^2 = D_0^1$ and, therefore, $x_1^2 = x_0^1$.
Considering the set $D_0^2 = \{x : Q^2 g(x) = g(x)\} = \{x : g(x) \ge \alpha\beta\, TQg(x)\}$ we see that
$D_0^2 = [x_0^2, \infty)$ for some $x_0^2$. In addition, $0 = x_2^2 \le x_1^2 \le x_0^2$, so that $\tau_0^2$ has the
following structure:

$$\tau_0^2 = \begin{cases} 0 & \text{for } S_0 \in [x_0^2, \infty), \\ 1 & \text{for } S_0 \in (0, x_0^2),\ S_1 \in [x_1^2, \infty), \\ 2 & \text{for } S_0 \in (0, x_0^2),\ S_1 \in (0, x_1^2). \end{cases}$$


Figs. 57 and 58 below give one a clear notion of the structure of the stopping
domains $D_n^N$ and the continuation domains $C_n^N$ and also of the functions $V_0^N(x) = Q^N g(x)$.

FIGURE 57. Call option. Stopping domains
$D_0^N = [x_0^N, \infty)$, $D_1^N = [x_1^N, \infty)$, ..., $D_N^N = [0, \infty)$,
and continuation domains
$C_0^N = (0, x_0^N)$, $C_1^N = (0, x_1^N)$, ..., $C_N^N = \varnothing$.
The trajectory $(S_0, S_1, S_2, \dots)$ leaves the continuation
domains at time $\tau_0^N$

FIGURE 58. Graphs of the functions $g(x) = (x - 1)^+$ and
$V_0^N(x) = Q^N g(x)$ for a discounted call option with pay-offs
$f_n = \beta^n g(x)$, $0 < \beta < 1$, $0 \le n \le N$, $\lambda > 1$

4. As follows from the above, finding the rational price $C_N(f; P)$ for $S_0 = x$ reduces
to finding the functions $V_0^N(x) = Q^N g(x)$, which can be calculated recursively by
the formula

$$Q^n g(x) = \max\bigl(Q^{n-1} g(x),\ \alpha\beta\, TQ^{n-1} g(x)\bigr) = \max\bigl(g(x),\ \alpha\beta\, TQ^{n-1} g(x)\bigr). \tag{17}$$

(See also [441; 2.2.1].)
Clearly, $V_0^N(x) \le V_0^{N+1}(x)$, and therefore there exists

$$V^*(x) = \lim_{N \to \infty} V_0^N(x). \tag{18}$$

By Theorem 4 in Chapter V, § 6a the function $V^* = V^*(x)$ is the smallest
$\alpha\beta$-excessive majorant of the function $g = g(x)$, i.e., $V^* = V^*(x)$ is the smallest
function $U = U(x)$ such that $U(x) \ge g(x)$ and $U(x) \ge (\alpha\beta)\, TU(x)$. In addition,
$V^* = V^*(x)$ satisfies the equation

$$V^*(x) = \max\bigl(g(x),\ (\alpha\beta)\, TV^*(x)\bigr), \tag{19}$$

following from (17) and (18).
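By the same theorem $V^*(x)$ is just the solution of the optimal stopping problem
in the class $\mathfrak{M}_0^\infty = \{\tau = \tau(\omega) : 0 \le \tau(\omega) < \infty,\ \omega \in \Omega\}$, i.e.,

$$V^*(x) = \sup_{\tau \in \mathfrak{M}_0^\infty} E_x (\alpha\beta)^\tau g(S_\tau). \tag{20}$$

The knowledge of $V^* = V^*(x)$ can be interesting also in the following respect:
$V^*(S_0)$ is equal at the same time to the rational price

$$C_\infty(f; P) = \inf\{y : \exists\, \pi \text{ with } X_0^\pi = y,\ X_\tau^\pi \ge \beta^\tau g(S_\tau),\ \forall\, \tau \in \mathfrak{M}_0^\infty\} \tag{21}$$

for the system $f = (f_n)_{n \ge 0}$ of pay-off functions

$$f_n(x) = \beta^n (x - K)^+, \qquad n \ge 0, \tag{22}$$

in the case when the buyer can choose an arbitrary stopping time $\tau$ in the set $\mathfrak{M}_0^\infty$
to exercise the contract. (The corresponding proof is similar to the proof of the
theorem in § 1c.)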

The analysis of options with exercise times in the set $\mathfrak{M}_0^\infty$ in place of the class
$\mathfrak{M}_0^N$ with finite $N$ can appear unappealing from the practical point of view. However,
one should take into account that the discount factor $\beta$, $0 < \beta < 1$, does not
allow optimal stopping times to be ‘excessively large’. Incidentally, the analytic
solution of problems of type (20) is much easier than the solution of problems of
the type (2) with finite $N$; and if $N$ is sufficiently large, then by $V^*(x)$ we can make
a (probably, crude) guess about the values of the function $V_0^N(x)$.

5. We now proceed to the function $V^*(x)$, which, as we know, must satisfy (19).
Note that $D_0^N \supseteq D_0^{N+1}$, and therefore $x_0^N \le x_0^{N+1}$. Hence there exists the limit
$\lim_{N \to \infty} x_0^N = x^*$, and by (19) the function $V^*(x)$ must have the following form:

$$V^*(x) = \begin{cases} g(x), & x \ge x^*, \\ (\alpha\beta)\, TV^*(x), & x < x^*. \end{cases} \tag{23}$$

We point out that both the ‘boundary point’ $x^*$ and the function $V^*(x)$ are unknown and
must be found here. Problems of this sort are referred to as free boundary or Stefan
problems (see, e.g., [441]).

In general, a solution $(x^*, V^*(x))$ to (23) is not necessarily unique and we may
need some additional conditions to select the ‘right’ solution. We discuss below the
arguments behind these additional conditions.

Let $C^* = (0, x^*)$ and let $D^* = [x^*, \infty)$. Then the function $V^*(x)$ in the domain
$C^*$ satisfies the equation

$$\varphi(x) = \alpha\beta\, T\varphi(x), \tag{24}$$

i.e. (in view of (4) in § 5a),

$$\varphi(x) = \alpha\beta \Bigl[p\,\varphi(\lambda x) + (1 - p)\,\varphi\Bigl(\frac{x}{\lambda}\Bigr)\Bigr]. \tag{25}$$

In accordance with the general theory of difference equations (see, e.g., [174])
we shall seek the solutions of this equation in the form $\varphi(x) = x^\gamma$. Then $\gamma$ must
be a root of the equation

$$1 = \beta\bigl[\alpha p \lambda^\gamma + \alpha(1 - p)\lambda^{-\gamma}\bigr]. \tag{26}$$


Recall that $p = \dfrac{r - a}{b - a}$, $b = \lambda - 1$, and $a = \lambda^{-1} - 1$. We have in fact found $p$
from the condition

$$\frac{p\lambda + (1 - p)\lambda^{-1}}{1 + r} = 1$$

(see (4) in Chapter V, § 4d), i.e., the relation

$$\alpha\lambda p + \alpha(1 - p)\lambda^{-1} = 1. \tag{27}$$

Comparing (26) and (27) we see that if $\beta = 1$, then (26) has the root $\gamma_1 = 1$
and another root $\gamma_2$ such that

$$\lambda^{\gamma_2} = \frac{1 - p}{\lambda p}. \tag{28}$$

Since

$$\frac{1 - p}{\lambda p} = \frac{\alpha\lambda - 1}{\lambda - \alpha} < 1,$$

it follows that $\gamma_2 < 0$.

Hence if $\beta = 1$, then the general solution to (25) has the following form:

$$\varphi(x) = c_1 x + c_2 x^{\gamma_2}, \tag{29}$$

where $\gamma_2 < 0$.
By the nature of the problem the required function $V^*(x)$ must be nonnegative
and nondecreasing. Hence $c_2 = 0$, and therefore

$$V^*(x) = \begin{cases} x - 1, & x \ge x^*, \\ c_1 x, & x < x^*, \end{cases} \tag{30}$$

where $x^*$ and $c_1$ are still to be defined.

By the submartingale property of the sequence $(\alpha^n (S_n - 1)^+)_{n \ge 0}$ with respect
to each measure $P_x$, $x \in E$, we obtain that $x^* = \infty$, because if $\beta = 1$, then for each point
$x \ge 1$

$$\alpha\, Tg(x) > g(x),$$

and therefore it would certainly be ‘more advantageous’ to make at least one
observation than to stop immediately.

Further, $c_1 \ge 1$ in (30), for if $c_1 < 1$, then $x^* < \infty$.

On the other hand, $c_1$ cannot be larger than 1 in view of the (additional)
property that $V^*(x)$ is the smallest $\alpha$-excessive majorant of $g(x)$, and the smallest
function $c_1 x$ with $c_1 \ge 1$ clearly has the coefficient $c_1 = 1$.

Thus, for $\beta = 1$ and $g(x) = (x - 1)^+$ we have

$$V^*(x) = \sup_{\tau \in \mathfrak{M}_0^\infty} E_x \alpha^\tau g(S_\tau) = x,$$

and there exists no optimal stopping time (in the class $\mathfrak{M}_0^\infty$). However, for each
$\varepsilon > 0$ and each $x \in E$ we can find a finite stopping time $\tau_{x,\varepsilon}$ such that

$$E_x \alpha^{\tau_{x,\varepsilon}} g(S_{\tau_{x,\varepsilon}}) \ge V^*(x) - \varepsilon.$$

(See [441; Chapter 3] for greater detail.)


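6. Assume now that $0 < \beta < 1$. Then equation (26) has two roots, $\gamma_1 > 1$ and
$\gamma_2 < 0$, such that the quantities $y_1 = \lambda^{\gamma_1}$ and $y_2 = \lambda^{\gamma_2}$, which are the solutions of
the quadratic equation

$$y = \beta\bigl[\alpha p\, y^2 + \alpha(1 - p)\bigr], \tag{31}$$

can be expressed as follows:

$$y_1 = \frac{A}{2} + \sqrt{\frac{A^2}{4} - B}, \qquad y_2 = \frac{A}{2} - \sqrt{\frac{A^2}{4} - B}, \tag{32}$$

where $A = (\alpha\beta p)^{-1}$ and $B = (1 - p)p^{-1}$.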
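
As a small numerical illustration of (31)-(32), the following sketch computes $\gamma_1$ and $\gamma_2$ from $\lambda$, $r$ and $\beta$. The function name `call_exponents` is ours, and $p$ is taken in the form $p = (r - a)/(b - a)$ with $b = \lambda - 1$, $a = \lambda^{-1} - 1$.

```python
import math


def call_exponents(lam, r, beta):
    """Return (gamma1, gamma2), the roots of 1 = beta*[alpha*p*lam**g + alpha*(1-p)*lam**(-g)]."""
    alpha = 1.0 / (1.0 + r)
    p = (1.0 + r - 1.0 / lam) / (lam - 1.0 / lam)
    A = 1.0 / (alpha * beta * p)
    B = (1.0 - p) / p
    root = math.sqrt(A * A / 4.0 - B)          # real, since 4*(alpha*beta)**2*p*(1-p) < 1
    y1, y2 = A / 2.0 + root, A / 2.0 - root    # y_i = lam**gamma_i, formula (32)
    return math.log(y1) / math.log(lam), math.log(y2) / math.log(lam)
```

For $\beta = 1$ the larger root reduces to $\gamma_1 = 1$, in agreement with (27) and (28).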
Thus, if $0 < \beta < 1$, then the general solution $\varphi(x)$ of (25) can be represented as
the sum:

$$\varphi(x) = c_1 x^{\gamma_1} + c_2 x^{\gamma_2}. \tag{33}$$

For the same reasons as in the case $\beta = 1$, the coefficient $c_2$ must here be equal to
zero and it follows from (23) that the required function is

$$V^*(x) = \begin{cases} x - 1, & x \ge x^*, \\ c^* x^{\gamma_1}, & x < x^*, \end{cases} \tag{34}$$

where $x^*$ and $c^*$ are constants to be defined (see (40)-(43) below).

To find $x^*$ and $c^*$ we use the observation that $V^*(x)$ must be the smallest
$\alpha\beta$-excessive majorant of the function $g(x) = (x - 1)^+$, $x \in E$.

The following reasoning shows how, in the class of functions

$$V_c(x; x_c) = \begin{cases} x - 1, & x \ge x_c, \\ c\, x^{\gamma_1}, & x < x_c, \end{cases} \tag{35}$$

with $x_c \in E$, $c > 0$, and $\gamma_1 > 1$, one can find the smallest majorant of the function
$g(x) = (x - 1)^+$. (Then, of course, we must verify that the function so obtained is
$\alpha\beta$-excessive.)

To this end we point out that for sufficiently large $c$ the function $\varphi_c(x) = c\, x^{\gamma_1}$
is certainly larger than $g(x)$ for all $x \in E$. Hence it is clear from (35) how one can
find the smallest majorant of $g(x)$ among the functions $V_c(x; x_c)$.

We now choose $c$ sufficiently large so that $\varphi_c(x) > g(x)$ for all $x \in E$, and then
make $c$ smaller until, for some value $\bar c$, the function $\varphi_{\bar c}(x)$ ‘meets’ $g(x)$ at some
point $\bar x_1$.

The functions $\varphi_c(x)$, $x \in E$, are convex; therefore, in principle, there can exist
another point, $\bar x_2 \in E$, such that $\bar x_2 > \bar x_1$ and $\varphi_{\bar c}(\bar x_2) = g(\bar x_2)$.

In our case the phase space $E = \{x = \lambda^k,\ k = 0, \pm 1, \dots\}$ is discrete. However,
if we assume that $\lambda = 1 + \Delta$, where $\Delta > 0$ is small, then the distance between $\bar x_1$
and $\bar x_2$ is also small and, moreover, these points ‘merge’ into one point, $\tilde x$, as $\Delta \downarrow 0$.

Clearly, $\tilde x$ is precisely the point in the interval $(0, \infty)$ where the graph of $\varphi_{\tilde c}(x) =
\tilde c\, x^{\gamma_1}$ for some value of $\tilde c$ touches the graph of $g(x) = (x - 1)^+$, $x \in (0, \infty)$.

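Obviously, $\tilde c$ and $\tilde x$ can be defined from the system of two equations

$$\varphi_{\tilde c}(\tilde x) = g(\tilde x), \tag{36}$$

$$\frac{d\varphi_{\tilde c}(x)}{dx}\bigg|_{x \uparrow \tilde x} = \frac{dg(x)}{dx}\bigg|_{x \downarrow \tilde x}, \tag{37}$$

which yields

$$\tilde x = \frac{\gamma_1}{\gamma_1 - 1}, \qquad \tilde c = \frac{(\gamma_1 - 1)^{\gamma_1 - 1}}{\gamma_1^{\gamma_1}}. \tag{38}$$

Moreover, it is clear that the function

$$V_{\tilde c}(x; \tilde x) = \begin{cases} x - 1, & x \ge \tilde x, \\ \tilde c\, x^{\gamma_1}, & x < \tilde x \end{cases} \tag{39}$$

is a good approximation to the least function of the form (35) if $\Delta > 0$ is sufficiently
small. (Cf. formula (37) in Chapter VIII, § 2a.)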
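
The pair (38) is easy to check numerically. In the sketch below (the function name `smooth_fit_call` is ours) the two equations (36) and (37) read $\tilde c\, \tilde x^{\gamma_1} = \tilde x - 1$ and $\tilde c\, \gamma_1 \tilde x^{\gamma_1 - 1} = 1$.

```python
def smooth_fit_call(gamma1):
    """Tangency point and coefficient from (38); requires gamma1 > 1."""
    x_tilde = gamma1 / (gamma1 - 1.0)
    c_tilde = (gamma1 - 1.0) ** (gamma1 - 1.0) / gamma1**gamma1
    return x_tilde, c_tilde


# Usage: the curve c*x**gamma1 touches the line x - 1 at x_tilde and stays above it elsewhere.
x_t, c_t = smooth_fit_call(1.7)
gap = [c_t * (0.1 * i) ** 1.7 - (0.1 * i - 1.0) for i in range(1, 101)]
```

Every entry of `gap` should be nonnegative, which is the majorant property on the continuous half-line $(0, \infty)$.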


Remark. We must point out condition (37) of ‘smooth fitting’, which enters our 
discussion in a fairly natural way. This condition often plays the role of an additional 
requirement in optimal stopping problems and enables one to single out the ‘right’ 
solution of a problem (see [441] and Chapter VIII in the present book.) 

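The above-described qualitative method of finding the smallest majorant of $g(x)$
brings one after more minute analysis (see [443]) to the following ‘optimal’ values
$x^*$ and $c^*$ of $\tilde x$ and $\tilde c$ ensuring that the corresponding function $V^*(x) = V_{c^*}(x; x^*)$
is not only the smallest majorant of $g(x)$, but also an $\alpha\beta$-excessive majorant:

$$c^* = \min(c_1^*, c_2^*), \tag{40}$$

where

$$c_1^* = \bigl(\lambda^{[\log_\lambda \tilde x]} - 1\bigr)\, \lambda^{-\gamma_1 [\log_\lambda \tilde x]}, \tag{41}$$

$$c_2^* = \bigl(\lambda^{[\log_\lambda \tilde x] + 1} - 1\bigr)\, \lambda^{-\gamma_1 [\log_\lambda \tilde x] - \gamma_1}, \tag{42}$$

and

$$x^* = \begin{cases} \lambda^{[\log_\lambda \tilde x]}, & \text{if } c^* = c_1^*, \\ \lambda^{[\log_\lambda \tilde x] + 1}, & \text{if } c^* = c_2^* \end{cases} \tag{43}$$

(here $[y]$ is the integer part of $y$ and $\tilde x$ is defined in (38)).

It is obvious that the function $V^*(x)$ so obtained is $\alpha\beta$-excessive for $x < x^*$
because, by construction, $\alpha\beta\, TV^*(x) = V^*(x)$ for such $x$. On the other hand, if
$x \ge x^*$, then we can verify directly that $\alpha\beta\, TV^*(x) \le V^*(x)$ once we take into
account (40)-(43) and the fact that $V^*(x) = x - 1$ for such $x$.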


7. By Theorem 4 in § 2a our function $V^*(x)$ is precisely equal to the supremum
$\sup_{\tau \in \mathfrak{M}_0^\infty} E_x (\alpha\beta)^\tau g(S_\tau)$, and the instant

$$\tau^* = \inf\{n : V^*(S_n) = g(S_n)\} = \inf\{n : S_n \ge x^*\}$$

is an optimal stopping time, provided that $P_x(\tau^* < \infty) = 1$, $x \in E$.
Clearly,

$$P_x(\tau^* > N) = P_x\Bigl(\max_{n \le N} S_n < x^*\Bigr) = P_x\Bigl(S_0\, \lambda^{\max_{n \le N} (\varepsilon_1 + \dots + \varepsilon_n)} < x^*\Bigr) \tag{44}$$

and since $P(\varepsilon_i = 1) = p$ and $P(\varepsilon_i = -1) = q$, the probability on the right-hand side
converges to zero as $N \to \infty$ for $p \ge q$.

By (5) the inequality $p \ge q$ is equivalent to the relation

$$r \ge \frac{a + b}{2}. \tag{45}$$

Bearing in mind that $b = \lambda - 1$ and $a = \lambda^{-1} - 1$ we see that $P_x(\tau^* < \infty) = 1$ for
each $x < x^*$, provided that

$$r \ge \frac{\lambda + \lambda^{-1}}{2} - 1. \tag{46}$$

On the other hand, if $x \ge x^*$, then $P_x(\tau^* = 0) = 1$ without regard to (46).
Summing up we arrive at the following result.


THEOREM 2. Assume that $0 < \beta < 1$ and that (46) holds. Then the rational price
$C_\infty(f; P)$ of an American call option with pay-offs $f_n = \beta^n (S_n - 1)^+$, $n \ge 0$, is
described by the formula

$$C_\infty(f; P) = V^*(S_0),$$

where

$$V^*(S_0) = \begin{cases} S_0 - 1, & S_0 \ge x^*, \\ c^* S_0^{\gamma_1}, & S_0 < x^*, \end{cases}$$

and the constants $c^*$ and $x^*$ can be found by (40)-(43). The optimal time for
exercising the option is $\tau^* = \inf\{n : S_n \ge x^*\}$. In addition,

$$V^*(S_0) = E_{S_0} (\alpha\beta)^{\tau^*} (S_{\tau^*} - 1)^+.$$


§5c. Standard Put Option Pricing 


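1. The pay-off functions for a standard put option are as follows:

$$f_n(y) = \beta^n (K - y)^+, \qquad y \in E, \tag{1}$$

where $0 < \beta \le 1$, $E = \{y = \lambda^k : k = 0, \pm 1, \dots\}$, $\lambda > 1$.
By analogy with the preceding section, we set

$$V_0^N(y) = \sup_{\tau \in \mathfrak{M}_0^N} E_y (\alpha\beta)^\tau (K - S_\tau)^+ \tag{2}$$

and

$$V^*(y) = \sup_{\tau \in \mathfrak{M}_0^\infty} E_y (\alpha\beta)^\tau (K - S_\tau)^+. \tag{3}$$

These values are of interest because

$$V_0^N(y) = C_N(f; P), \qquad y = S_0, \tag{4}$$

and

$$V^*(y) = C_\infty(f; P), \qquad y = S_0, \tag{5}$$

where the prices $C_N(f; P)$ and $C_\infty(f; P)$ for the system $f = (f_n)_{n \ge 0}$ of functions
$f_n = f_n(y)$ defined by (1) are as in formula (7) in § 5a and formula (21) in § 5b,
respectively.

THEOREM 1. For each $N \ge 0$ there exists a sequence $y_n^N$, $0 \le n \le N$, with values
in $E \cup \{+\infty\}$ such that

$$D_n^N = \{y \in E : y \in (0, y_n^N]\}, \tag{6}$$

$$C_n^N = \{y \in E : y \in (y_n^N, \infty)\} \tag{7}$$

and

$$\tau_0^N = \min\{0 \le n \le N : S_n \in D_n^N\} = \min\{0 \le n \le N : S_n \in (0, y_n^N]\}.$$

Moreover,

$$y_0^N \le \dots \le y_{N-1}^N \le y_N^N = \infty \tag{8}$$

and

$$V_0^N(y) = \begin{cases} g(y), & y \in D_0^N = (0, y_0^N], \\ Q^N g(y), & y \in C_0^N = (y_0^N, \infty). \end{cases} \tag{9}$$

The rational price $C_N(f; P)$ is equal to $V_0^N(S_0)$.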


Proof. This is similar to the case of call options discussed in § 5b; the proof is based
on the analysis of the subset of points $y \in E$ at which the operators $Q^n$ increase
the value of the function $g(y)$.


It is worth noting that, of course, the operator Q raises the value of g(y) at 
the point y = K (we have assumed above for simplicity that K = 1) and Qg(y) = 
g(y) = 0 for y > K. Hence these values of y € E can be ascribed both to the 
stopping domains and to the continuation domains. As seen from (6) and (7), we 
have actually put these points in the continuation domains. 


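2. We consider now the question of finding the function $V^*(y)$ $(= \lim_{N \to \infty} V_0^N(y))$,
the quantity $y^* = \lim_{N \to \infty} y_0^N$, and the optimal time $\tau^*$ such that

$$V^*(y) = E_y (\alpha\beta)^{\tau^*} (K - S_{\tau^*})^+ \tag{10}$$

(for simplicity we set $K = 1$ again).
Let $C^* = (y^*, \infty)$ and let $D^* = (0, y^*]$. As in § 5b, we see that the function
$V^*(y)$ in the domain $C^*$ is a solution of the equation

$$\varphi(y) = \alpha\beta \Bigl[p\,\varphi(\lambda y) + (1 - p)\,\varphi\Bigl(\frac{y}{\lambda}\Bigr)\Bigr].$$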


Its general solution is $c_1 y^{\gamma_1} + c_2 y^{\gamma_2}$, where $\gamma_1 > 1$ and $\gamma_2 < 0$ (see (31) and (32)
in § 5b).

FIGURE 59. Put option. Stopping domains
$D_0^N = (0, y_0^N]$, ..., $D_{N-1}^N = (0, y_{N-1}^N]$, $D_N^N = (0, \infty)$,
and continuation domains
$C_0^N = (y_0^N, \infty)$, ..., $C_{N-1}^N = (y_{N-1}^N, \infty)$, $C_N^N = \varnothing$.
The trajectory $(S_0, S_1, S_2, \dots)$ leaves the continuation
domains at time $\tau_0^N$

FIGURE 60. Graphs of the functions $g(y) = (1 - y)^+$ and
$V_0^N(y) = Q^N g(y)$ for a put option with pay-offs $f_n = \beta^n g(y)$,
where $0 < \beta < 1$, $0 \le n \le N$, and $\lambda > 1$

Since $V^*(y) \le 1$, it follows that $c_1 = 0$, so that we must seek $V^*(y)$ in the class
of functions

$$V_c(y; y_c) = \begin{cases} 1 - y, & y \le y_c, \\ c\, y^{\gamma_2}, & y > y_c, \end{cases}$$

where the ‘optimal’ values $c^*$ and $y^*$ of $c$ and $y_c$ are to be determined on the basis of
the above-mentioned (§ 5b.6) additional conditions that $V^*(y) = V_{c^*}(y; y^*)$ must
be the smallest $\alpha\beta$-excessive majorant of $g(y) = (1 - y)^+$.

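Following the scheme (exposed in § 5b) of finding, for small $\Delta = \lambda - 1 > 0$,
approximations $\tilde c$ and $\tilde y$ to the parameters $c^*$ and $y^*$, we see that they can be
determined from the system of equations

$$\varphi_{\tilde c}(\tilde y) = g(\tilde y), \tag{11}$$

$$\frac{d\varphi_{\tilde c}(y)}{dy}\bigg|_{y \downarrow \tilde y} = \frac{dg(y)}{dy}\bigg|_{y \uparrow \tilde y}, \tag{12}$$

where $\varphi_c(y) = c\, y^{\gamma_2}$. Solving it we obtain

$$\tilde y = \Bigl|\frac{\gamma_2}{\gamma_2 - 1}\Bigr|, \qquad \tilde c = \frac{|\gamma_2 - 1|^{\gamma_2 - 1}}{|\gamma_2|^{\gamma_2}}. \tag{13}$$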
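
A quick numerical check of (13) in the same spirit as for the call option; the function name `smooth_fit_put` is ours. Conditions (11) and (12) read $\tilde c\, \tilde y^{\gamma_2} = 1 - \tilde y$ and $\tilde c\, \gamma_2 \tilde y^{\gamma_2 - 1} = -1$.

```python
def smooth_fit_put(gamma2):
    """Tangency point and coefficient from (13); requires gamma2 < 0."""
    y_tilde = abs(gamma2 / (gamma2 - 1.0))
    c_tilde = abs(gamma2 - 1.0) ** (gamma2 - 1.0) / abs(gamma2) ** gamma2
    return y_tilde, c_tilde


# Usage with an illustrative exponent: the curve c*y**gamma2 touches 1 - y at y_tilde.
y_t, c_t = smooth_fit_put(-1.0)
```

For $\gamma_2 = -1$ this gives $\tilde y = 1/2$ and $\tilde c = 1/4$: the hyperbola $1/(4y)$ touches the line $1 - y$ at $y = 1/2$.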


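Once we know the values $\tilde y$ and $\tilde c$ corresponding to the ‘limiting’ case ($\lambda \downarrow 1$) we
can find (see [443]) the values of $y^*$ and $c^*$ in the initial, ‘prelimit’ scheme (with
$\lambda > 1$) by the formulas

$$c^* = \min(c_1^*, c_2^*), \tag{14}$$

where

$$c_1^* = \bigl(1 - \lambda^{[\log_\lambda \tilde y]}\bigr)\, \lambda^{-\gamma_2 [\log_\lambda \tilde y]}, \tag{15}$$

$$c_2^* = \bigl(1 - \lambda^{[\log_\lambda \tilde y] + 1}\bigr)\, \lambda^{-\gamma_2 [\log_\lambda \tilde y] - \gamma_2}, \tag{16}$$

and

$$y^* = \begin{cases} \lambda^{[\log_\lambda \tilde y]}, & \text{if } c^* = c_1^*, \\ \lambda^{[\log_\lambda \tilde y] + 1}, & \text{if } c^* = c_2^*. \end{cases} \tag{17}$$

The fact that the so-obtained smallest majorant $V^*(y)$ of $g(y) = (1 - y)^+$ is
$\alpha\beta$-excessive can be established by an immediate verification.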
We finally observe that the condition

$$r \le \frac{a + b}{2} = \frac{\lambda + \lambda^{-1}}{2} - 1 \tag{18}$$

(cf. (45) in § 5b) ensures that $P_y(\tau^* < \infty) = 1$ for $y \in E$ and $\tau^* = \inf\{n : S_n \le y^*\}$.
(If $y \le y^*$, then $P_y(\tau^* = 0) = 1$.)
Thus, if condition (18) holds, then $\tau^*$ is optimal in the following sense:
property (10) holds for all $y \in E$.

THEOREM 2. Assume that $0 < \beta < 1$ and condition (18) holds. Then the rational
price $C_\infty(f; P)$ of an American put option with pay-offs $f_n = \beta^n (1 - S_n)^+$, $n \ge 0$,
is defined by the formula

$$C_\infty(f; P) = V^*(S_0), \tag{19}$$

where

$$V^*(S_0) = \begin{cases} 1 - S_0, & S_0 \le y^*, \\ c^* S_0^{\gamma_2}, & S_0 > y^*, \end{cases} \tag{20}$$

and the parameters $c^*$ and $y^*$ can be found by (14)-(17). The optimal time of
exercising the option is $\tau^* = \inf\{n : S_n \le y^*\}$. Moreover,

$$V^*(S_0) = E_{S_0} (\alpha\beta)^{\tau^*} (1 - S_{\tau^*})^+.$$


§5d. Options with Aftereffect. ‘Russian Option’ Pricing 


1. The pay-offs $f_n$ for the call and put options considered above have the Markovian
structure:

$$f_n = \beta^n (S_n - K)^+ \quad\text{and}\quad f_n = \beta^n (K - S_n)^+, \tag{1}$$

respectively.
It is of interest both for theory and financial engineering to consider also options
with aftereffect. Examples here can be options with the following pay-off functions:

$$f_n = \beta^n \Bigl(a S_n - \min_{r \le n} S_r\Bigr)^+, \tag{2}$$

$$f_n = \beta^n \Bigl(\max_{r \le n} S_r - a S_n\Bigr)^+, \tag{3}$$

or with pay-offs

$$f_n = \beta^n \Bigl(a S_n - \frac{1}{n+1} \sum_{k=0}^{n} S_k\Bigr)^+, \tag{4}$$

$$f_n = \beta^n \Bigl(\frac{1}{n+1} \sum_{k=0}^{n} S_k - a S_n\Bigr)^+. \tag{5}$$

Options with pay-offs (4) and (5) are called Asian (call and put) options. Call
and put options with pay-offs (2) and (3) have been considered for $a = 0$ in [434]
and [435], where they are called ‘Russian options’ (see also [118] and [283]). In
what follows we stick to [283].


2. We consider the CRR-model in which $\rho_n$ can take two values, $\lambda - 1$ and $\lambda^{-1} - 1$,
where $\lambda > 1$. In addition, for definiteness, we shall consider an American put option
with pay-off function (3), where $\beta$, $0 < \beta < 1$, is the discount factor.

In accordance with the general theory (see subsection 2), the rational price $\hat C$ of
such an option can be found by the formula

$$\hat C = \sup_{\tau \in \mathfrak{M}_0^\infty} E \alpha^\tau f_\tau, \tag{6}$$

where $\alpha = (1 + r)^{-1}$ and $E$ is averaging with respect to the martingale measure $P$
such that $p$ and $q$ are described by formula (6) in § 5a.
Since

$$\hat C = \sup_{\tau \in \mathfrak{M}_0^\infty} E (\alpha\beta)^\tau \Bigl(\max_{r \le \tau} S_r - a S_\tau\Bigr)^+ \tag{7}$$

and $S_n = S_0 \lambda^{\varepsilon_1 + \dots + \varepsilon_n}$, the quantity $\hat C$ is definitely finite ($\hat C \le S_0$) if

$$\alpha\beta\lambda \le 1. \tag{8}$$

We set $Y_n = \max_{k \le n} S_k$. Clearly,

$$Y_n = \max\{Y_{n-1}, S_n\}. \tag{9}$$

Moreover, $(S_n, Y_n)_{n \ge 0}$ is a Markov sequence and, in principle, one can solve the
optimal stopping problem (7) on the basis of general results on optimal stopping
rules for two-dimensional Markov chains (see [441] and § 2a).

However—and this is remarkable—our two-dimensional Markov problem can be 
reduced to some one-dimensional Markov problem if one uses the idea of change 
of measures and chooses an appropriate discounting asset (numéraire). (See also 
Chapter VII, § 1b on this subject.) 

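Let $\tau \in \mathfrak{M}_0^\infty$. We recall that $B_n = B_0 \alpha^{-n}$ with $\alpha = (1 + r)^{-1}$, so that

$$E (\alpha\beta)^\tau \Bigl(\max_{r \le \tau} S_r - a S_\tau\Bigr)^+ = E (\alpha\beta)^\tau \Bigl(\frac{Y_\tau}{S_\tau} - a\Bigr)^+ S_\tau = S_0\, E\Bigl[\beta^\tau \Bigl(\frac{Y_\tau}{S_\tau} - a\Bigr)^+ \frac{S_\tau / S_0}{B_\tau / B_0}\Bigr]. \tag{10}$$

We set $Z_n = \dfrac{S_n / S_0}{B_n / B_0}$. Then $Z_n > 0$, and the sequence $Z = (Z_n, \mathcal{F}_n, P)_{n \ge 0}$ is a
$P$-martingale with $E Z_n = 1$.

For $A \in \mathcal{F}_n$ we set

$$\hat P_n(A) = E(Z_n I_A).$$

Clearly, the measures in the collection $(\hat P_n)_{n \ge 0}$ are compatible (i.e., $\hat P_{n+1} \,|\, \mathcal{F}_n = \hat P_n$,
$n \ge 0$) and by Ionescu Tulcea’s theorem on the extension of a measure (see, e.g.,
[439; Chapter II, § 9]) there exists a measure $\hat P$ (in the space $\Omega = \{-1, 1\}^\infty$) such
that $\hat P \,|\, \mathcal{F}_n = \hat P_n$, $n \ge 0$.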
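Hence

$$E (\alpha\beta)^\tau \Bigl(\max_{r \le \tau} S_r - a S_\tau\Bigr)^+ = S_0\, \hat E \beta^\tau \Bigl(\frac{Y_\tau}{S_\tau} - a\Bigr)^+. \tag{11}$$

We now set

$$X_n = \frac{Y_n}{S_n} \tag{12}$$

and observe that

$$X_{n+1} = \max\Bigl(\frac{X_n}{\lambda^{\varepsilon_{n+1}}},\ 1\Bigr) \tag{13}$$

and, moreover, all $X_n$ range in the set $\hat E = \{1, \lambda, \lambda^2, \dots\}$.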
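
Recursion (13) can be verified on any path by working with exponents of $\lambda$, which keeps the arithmetic exact. The sketch below is only an illustration; the function name `ratio_exponents` is ours, and it assumes $X_0 = 1$, i.e., $Y_0 = S_0$.

```python
def ratio_exponents(eps_path):
    """For a path of eps = +1/-1 return pairs (direct, recursive) of log_lam X_n."""
    k = 0        # log_lam(S_n / S_0)
    k_max = 0    # log_lam(Y_n / S_0), where Y_n is the running maximum of S
    j = 0        # log_lam X_n obtained from recursion (13), starting at X_0 = 1
    out = [(0, 0)]
    for eps in eps_path:
        k += eps
        k_max = max(k_max, k)
        j = max(j - eps, 0)               # X_{n+1} = max(X_n * lam**(-eps), 1)
        out.append((k_max - k, j))        # direct value Y_n / S_n versus the recursion
    return out
```

Both components agree at every step and are nonnegative integers, which is the statement that $X_n$ takes its values in $\{1, \lambda, \lambda^2, \dots\}$.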
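With respect to the new measure $\hat P$ the sequence $\varepsilon = (\varepsilon_n)_{n \ge 1}$ is also a sequence
of independent identically distributed (i.i.d.) random variables with

$$\hat p = \hat P(\varepsilon_n = 1) = E\bigl[\alpha\lambda\, I(\varepsilon_n = 1)\bigr] = \alpha\lambda p \tag{14}$$

and

$$\hat q = \hat P(\varepsilon_n = -1) = \alpha\lambda^{-1}(1 - p) \tag{15}$$

(so that $\hat p + \hat q = 1$ by relation (27) in § 5b).

We shall consider the sequence $X = (X_n)_{n \ge 0}$ defined by recursive relations (13) under
the assumption that $X_0 = x \in \hat E$. Let $\hat P_x$ be the probability distribution of this
sequence. Then $X = (X_n, \hat{\mathcal{F}}_n, \hat P_x)$ with $x \in \hat E$ and $\hat{\mathcal{F}}_n = \sigma(X_0, X_1, \dots, X_n)$,
$n \ge 0$, is a Markov sequence and, therefore, to find the price $\hat C$ one must consider
the optimal stopping problem

$$\hat V(x) = \sup_{\tau \in \hat{\mathfrak{M}}_0^\infty} \hat E_x \beta^\tau (X_\tau - a)^+, \qquad x \in \hat E, \tag{16}$$

where $\hat{\mathfrak{M}}_0^\infty$ is the class of finite stopping times $\tau = \tau(\omega)$ such that $\{\omega : \tau(\omega) \le n\} \in
\hat{\mathcal{F}}_n$, $n \ge 0$.

The price $\hat C$ in question is connected with the solution $\hat V(x)$ of this problem by
the formula

$$\hat C = S_0 \hat V(1). \tag{17}$$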
Remark 1. Strictly speaking, the supremum in (7) is taken over the class $\mathfrak{M}_0^\infty$ and
therefore formula (17) is valid if, in the definition (16) of $\hat V(x)$, we consider the
supremum over the wider class $\mathfrak{M}_0^\infty$ in place of $\hat{\mathfrak{M}}_0^\infty$. However, these two suprema
are the same, which follows from the general theory of optimal stopping rules for
Markov sequences (see [441]) and is in effect proved below (see Remark 2).

3. Let $g(x) = (x - a)^+$, $x \in \hat E$, and let

$$\hat V^N(x) = \sup_{\tau \in \hat{\mathfrak{M}}_0^N} \hat E_x \beta^\tau g(X_\tau),$$

where $\hat{\mathfrak{M}}_0^N$ is the class of stopping times $\tau$ in $\hat{\mathfrak{M}}_0^\infty$ such that $\tau(\omega) \le N$, $\omega \in \Omega$ (see
Fig. 61).

FIGURE 61. Graphs of functions $g(x) = (x - a)^+$ and
$\hat V^N(x) = \hat Q^N g(x)$ for $0 < a < 1$


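We also set

$$\hat T f(x) = \hat E_x f(X_1) = \hat p\, f\Bigl(\frac{x}{\lambda} \vee 1\Bigr) + (1 - \hat p)\, f(\lambda x) \tag{18}$$

and

$$\hat Q f(x) = \max\bigl(f(x),\ \beta\, \hat T f(x)\bigr). \tag{19}$$

By Theorem 3 in § 2a and our remark upon it,

$$\hat V^N(x) = \hat Q^N g(x) \tag{20}$$

and the optimal stopping time $\hat\tau^N \in \hat{\mathfrak{M}}_0^N$ can be described as follows (cf. (9) in § 5b):

$$\hat\tau^N = \min\{0 \le n \le N : X_n \in \hat D_n^N\}, \tag{21}$$

where

$$\hat D_n^N = \{x \in \hat E : \hat V^{N-n}(x) = g(x)\}. \tag{22}$$

Clearly, $\hat D_0^N \subseteq \hat D_1^N \subseteq \dots \subseteq \hat D_N^N = \hat E = \{1, \lambda, \lambda^2, \dots\}$.

In the same way as in § 5b, considering successively the functions
$\hat Q g(x), \dots, \hat Q^N g(x)$ and comparing them with $g(x)$ we see that the stopping
domains $\hat D_n^N$ are as follows:

$$\hat D_n^N = \{x \in \hat E : x \in [\hat x_n^N, \infty)\}, \tag{23}$$

where

$$1 = \hat x_N^N \le \hat x_{N-1}^N \le \dots \le \hat x_0^N. \tag{24}$$

The qualitative picture of the stopping domains $\hat D_n^N$ and the continuation
domains $\hat C_n^N = \hat E \setminus \hat D_n^N$ is as in Fig. 57 in § 5b (with the self-evident change of notation
$S_i \to X_i$ and $E \to \hat E$, ..., $x_N^N = 0 \to \hat x_N^N = 1$).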


Remark 2. If

$$V_0^N(x) = \sup_{\tau \in \mathfrak{M}_0^N} \hat E_x \beta^\tau g(X_\tau),$$

then it follows by Theorem 3 in § 2a that $V_0^N(x) = \hat Q^N g(x)$. Comparing this
with (20) we see that $\hat V^N(x) = V_0^N(x)$, $x \in \hat E$, and that the instant $\hat\tau^N$ defined
by (21) is optimal not only in the class $\hat{\mathfrak{M}}_0^N$, but also in the broader class $\mathfrak{M}_0^N$.

4. Since $g(x) \ge 0$, it follows by Theorem 4 in § 2b that $\hat V(x) = \lim_{N \to \infty} \hat V^N(x)$. We
now set

$$\hat\tau = \inf\{n \ge 0 : \hat V(X_n) = g(X_n)\} = \inf\{n \ge 0 : X_n \in \hat D\},$$

where $\hat D = \{x \in \hat E : x \in [\hat x, \infty)\}$ and $\hat x = \lim_{N \to \infty} \hat x_0^N$.

By the same theorem, $\hat\tau$ is an optimal stopping time in the problem (16),
provided that $\hat P_x(\hat\tau < \infty) = 1$, $x \in \hat E$. Leaving aside for the moment this property of $\hat\tau$
we proceed to $\hat x$ and $\hat V(x)$.

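The function $\hat V(x)$ satisfies the equation

$$\hat V(x) = \max\bigl(g(x),\ \beta\, \hat T \hat V(x)\bigr), \qquad x \in \hat E, \tag{25}$$

therefore, in the continuation domain $\hat C = \hat E \setminus \hat D$ it is a solution of the equation

$$\varphi(x) = \beta\, \hat T \varphi(x), \qquad x \in \hat C, \tag{26}$$

or, more explicitly,

$$\varphi(x) = \beta \Bigl[\hat p\, \varphi\Bigl(\frac{x}{\lambda} \vee 1\Bigr) + (1 - \hat p)\, \varphi(\lambda x)\Bigr], \qquad x \in \hat C. \tag{27}$$

In particular, for $x = 1$ we have

$$\varphi(1) = \beta \bigl[\hat p\, \varphi(1) + (1 - \hat p)\, \varphi(\lambda)\bigr], \tag{28}$$

while for $x \ge \lambda$,

$$\varphi(x) = \beta \Bigl[\hat p\, \varphi\Bigl(\frac{x}{\lambda}\Bigr) + (1 - \hat p)\, \varphi(\lambda x)\Bigr]. \tag{29}$$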


630 Chapter VI. Theory of Pricing. Discrete Time 


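It would be natural to seek a solution of (29) in the form of a power function $x^\gamma$
(cf. § 5b.5). Then we obtain for $\gamma$ the equation

$$1 = \beta\bigl[\hat p\, \lambda^{-\gamma} + (1 - \hat p)\, \lambda^{\gamma}\bigr], \tag{30}$$

with two solutions, $\gamma_1 < 0$ and $\gamma_2 > 1$, such that the quantities $y_1 = \lambda^{\gamma_1}$ and
$y_2 = \lambda^{\gamma_2}$ are defined by the formulas

$$y_1 = \frac{A}{2} - \sqrt{\frac{A^2}{4} - B}, \qquad y_2 = \frac{A}{2} + \sqrt{\frac{A^2}{4} - B} \tag{31}$$

with

$$A = \frac{1}{\beta(1 - \hat p)} \quad\text{and}\quad B = \frac{\hat p}{1 - \hat p}. \tag{32}$$

For $x \ge \lambda$ the general solution $\varphi(x)$ of (29) can be represented as $c\, \psi_b(x)$, where
$\psi_b(x) = b\, x^{\gamma_1} + (1 - b)\, x^{\gamma_2}$.
Since $\psi_b(1) = 1$, it follows that $c = \varphi(1)$.
Substituting $\varphi(\lambda) = \varphi(1)\psi_b(\lambda)$ in (28) and bearing in mind that $\varphi(1) \ne 0$ due
to the nature of the problem, we obtain the following equation for the unknown $b$:

$$1 = \beta \bigl\{\hat p + (1 - \hat p)\bigl[b\, \lambda^{\gamma_1} + (1 - b)\, \lambda^{\gamma_2}\bigr]\bigr\}. \tag{33}$$

It has the solution

$$b = \frac{(1 - \hat p)\lambda^{\gamma_2} + \hat p - \beta^{-1}}{(1 - \hat p)(\lambda^{\gamma_2} - \lambda^{\gamma_1})}. \tag{34}$$

Since $\gamma_1$ and $\gamma_2$ can be determined from (30), it easily follows that $0 < b < 1$.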
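
The quantities in (30)-(34) can be computed directly. The following sketch is an illustration under stated assumptions: the function name `russian_parameters` is ours, $\hat p = \alpha\lambda p$ is taken from (14), and $p = (r - a)/(b - a)$ with $b = \lambda - 1$, $a = \lambda^{-1} - 1$ (this $b$ is the upper return of the CRR-model, not the coefficient $b$ of (34), which the code calls `b_coef`).

```python
import math


def russian_parameters(lam, r, beta):
    """Return (p_hat, gamma1, gamma2, b_coef) from formulas (14) and (30)-(34)."""
    alpha = 1.0 / (1.0 + r)
    p = (1.0 + r - 1.0 / lam) / (lam - 1.0 / lam)
    p_hat = alpha * lam * p                         # formula (14)
    A = 1.0 / (beta * (1.0 - p_hat))                # formula (32)
    B = p_hat / (1.0 - p_hat)
    root = math.sqrt(A * A / 4.0 - B)
    y1, y2 = A / 2.0 - root, A / 2.0 + root         # formula (31): y_i = lam**gamma_i
    b_coef = ((1.0 - p_hat) * y2 + p_hat - 1.0 / beta) / ((1.0 - p_hat) * (y2 - y1))
    return p_hat, math.log(y1) / math.log(lam), math.log(y2) / math.log(lam), b_coef
```

Using $y_1 y_2 = B$ one can rewrite (34) as $b = y_1 (y_2 - 1)/(y_2 - y_1)$, which makes the inequalities $0 < b < 1$ transparent, since $y_1 < 1 < y_2$.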
Let $\hat V_{c_0}(x) = c_0 \psi_b(x)$ for $x < x_0$, where $c_0$ and $x_0$ are some constants to be
determined. Clearly, the required function $\hat V(x)$ belongs to the family of functions

$$\hat V_{c_0}(x; x_0) = \begin{cases} (x - a)^+, & x \ge x_0, \\ \hat V_{c_0}(x), & x < x_0. \end{cases} \tag{35}$$

Here the ‘optimal’ values $\hat c$ and $\hat x$ of $c_0$ and $x_0$ can be found by means of the
following considerations: the required function $\hat V(x) = \hat V_{\hat c}(x; \hat x)$ must be the smallest
$\beta$-excessive majorant of $g(x)$, i.e., the smallest function satisfying simultaneously
the two inequalities

$$\hat V(x) \ge g(x), \qquad \hat V(x) \ge \beta\, \hat T \hat V(x) \tag{36}$$

for each $x \in \hat E = \{1, \lambda, \lambda^2, \dots\}$.

The solubility of this problem can be proved and the exact values of $\hat c$ and $\hat x$
can be found in precisely the same manner as for a standard call option (see § 5b.6
and [283]). Namely, if $\Delta = \lambda - 1$ is close to zero, then we can consider approximations
$\tilde c$ and $\tilde x$ of $\hat c$ and $\hat x$ obtained as follows (cf. a similar procedure in §§ 5b, c).

We shall assume that the functions $\psi_b(x)$, $\hat V_{c_0}(x)$, $g(x)$, and $\hat V_{c_0}(x; x_0)$ on
$\hat E = \{1, \lambda, \lambda^2, \dots\}$ are defined by the same formulas on $[1, \infty)$.

Then the approximations $\tilde c$ and $\tilde x$ can be found from the additional conditions

$$\hat V_{\tilde c}(\tilde x) = g(\tilde x), \qquad \frac{d\hat V_{\tilde c}(x)}{dx}\bigg|_{x \uparrow \tilde x} = \frac{dg(x)}{dx}\bigg|_{x \downarrow \tilde x}. \tag{37}$$

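Bearing in mind that

$$\hat V_{\tilde c}(x) = \tilde c\, \psi_b(x) = \tilde c\, \bigl[b\, x^{\gamma_1} + (1 - b)\, x^{\gamma_2}\bigr],$$

$g(x) = (x - a)^+$, and, for sure, $\tilde x > a$, we see that $\tilde c$ and $\tilde x$ are the solutions of the
system of equations

$$\tilde c\, \bigl[b\, \tilde x^{\gamma_1} + (1 - b)\, \tilde x^{\gamma_2}\bigr] = \tilde x - a, \qquad \tilde c\, \bigl[b \gamma_1 \tilde x^{\gamma_1 - 1} + (1 - b) \gamma_2 \tilde x^{\gamma_2 - 1}\bigr] = 1. \tag{38}$$

In particular, for $a = 0$ we have

$$\tilde x = \Bigl|\frac{b}{1 - b} \cdot \frac{1 - \gamma_1}{\gamma_2 - 1}\Bigr|^{\frac{1}{\gamma_2 - \gamma_1}}, \tag{39}$$

$$\tilde c = \frac{\tilde x}{\gamma_1 b\, \tilde x^{\gamma_1} + \gamma_2 (1 - b)\, \tilde x^{\gamma_2}}. \tag{40}$$

The above considerations show that for positive, sufficiently small $\Delta$ the quantity
$\hat V_{\tilde c}(1)$ is close to $\hat V(1)$. Hence, in view of (17) and the equality $\hat V_{\tilde c}(1) = \tilde c$, we obtain
that $\hat C \approx S_0 \cdot \tilde c$ for small $\Delta > 0$. (See [283] for a more detailed analysis; cf. also
Chapter VIII, § 2d.)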
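
For $a = 0$ formulas (39) and (40) can be evaluated and checked against the system (38). The sketch below takes $\gamma_1 < 0$, $\gamma_2 > 1$ and $0 < b < 1$ as given inputs; the function name `russian_smooth_fit` and the numerical values in the usage lines are purely illustrative and are not calibrated to any market.

```python
def russian_smooth_fit(gamma1, gamma2, b):
    """Return (x_tilde, c_tilde) from (39)-(40) for a = 0."""
    ratio = b * (1.0 - gamma1) / ((1.0 - b) * (gamma2 - 1.0))
    x_tilde = abs(ratio) ** (1.0 / (gamma2 - gamma1))
    c_tilde = x_tilde / (gamma1 * b * x_tilde**gamma1
                         + gamma2 * (1.0 - b) * x_tilde**gamma2)
    return x_tilde, c_tilde


# Usage: with S_0 = 100 the approximate price is S_0 * c_tilde.
x_t, c_t = russian_smooth_fit(-1.0, 2.0, 0.5)
approx_price = 100.0 * c_t
```

The approximation $\hat C \approx S_0 \tilde c$ is only as good as the passage from the lattice to the half-line $[1, \infty)$, so it should be read as a small-$\Delta$ estimate rather than an exact price.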
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1. Investment Portfolio in Semimartingale Models 


§ la. Admissible Strategies. Self-Financing. 
Stochastic Vector Integral 


In this section we shall consider models of two securities markets with continuous
time:
a (B, S)-market formed by a bank account B and stock $S = (S^1, \dots, S^d)$ of finitely
many kinds,
and
a (B, P)-market formed also by a bank account B and, in general, a continual
family of bonds $P = \{P(t, T);\ 0 \le t \le T,\ T > 0\}$.
In §§ 1-4 we are concerned with (B, S)-markets. A (B, P)-market has some
peculiarities and we defer its discussion to § 5.


1. We consider a financial market of $d + 1$ assets $X = (X^0, X^1, \dots, X^d)$ that
operates in uncertain conditions of the probabilistic character described by a filtered
probability space (stochastic basis) $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t \ge 0}, P)$, where $(\mathcal{F}_t)_{t \ge 0}$ is the flow of
incoming ‘information’.

Our main assumption about the assets $X^i = (X_t^i)_{t \ge 0}$, $i = 0, 1, \dots, d$, is that
they are positive semimartingales (see Chapter III, § 5a).

By analogy with the discrete-time case we call an arbitrary predictable
(see Chapter III, § 5a) $(d + 1)$-dimensional process $\pi = (\pi^0, \pi^1, \dots, \pi^d)$, where
$\pi^i = (\pi_t^i)_{t \ge 0}$, an investment portfolio, and we shall say that $\pi$ describes the strategy
(of an investor, trader, ...) in the above market.
The process $X^\pi = (X_t^\pi)_{t \ge 0}$, where

$$X_t^\pi = \sum_{i=0}^{d} \pi_t^i X_t^i, \tag{1}$$

or, in the vector notation,

$$X_t^\pi = (\pi_t, X_t), \tag{2}$$

is called the value (or the value process) of the portfolio $\pi$. The value $x = X_0^\pi$ is
the initial capital, which one emphasizes sometimes by the notation $X^\pi = X^\pi(x)$.


2. In Chapter V, § 1a, in the discussion of the discrete-time case we introduced
the concept of self-financing strategy $\pi$ and explained the role of these strategies,
in which all changes of the value $X^\pi$ must be the results of changes in the market
value (price) of the assets $X^i$ and no in- or outflows of capital are possible.

The definition of self-financing becomes slightly more delicate in the continuous- 
time case. In the final analysis this is related to the problem of the description of 
the class of integrable functions with respect to the semimartingales in question. 

We recall that in the discrete-time case (see Chapter V, § 1a) we say that a
portfolio $\pi = (\pi^0, \pi^1, \dots, \pi^d)$ is self-financing ($\pi \in SF$) if for each $n \ge 1$ we have

$$X_n^\pi = X_0^\pi + \sum_{k=1}^{n} (\pi_k, \Delta X_k), \tag{3}$$

or, in a more expanded form,

$$X_n^\pi = X_0^\pi + \sum_{k=1}^{n} \sum_{i=0}^{d} \pi_k^i \Delta X_k^i. \tag{4}$$
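
A toy numerical check of the discrete-time condition (3) may help fix ideas. The sketch below is an illustration only: the function names `value_process` and `is_self_financing` and the two-asset price path are ours, with `pi[k]` the holdings $\pi_k$ and `X[k]` the prices $X_k$.

```python
def value_process(pi, X):
    """X^pi_k = (pi_k, X_k): scalar product of holdings and prices at each time k."""
    return [sum(h * x for h, x in zip(pi_k, X_k)) for pi_k, X_k in zip(pi, X)]


def is_self_financing(pi, X, tol=1e-12):
    """Check X^pi_n = X^pi_0 + sum_{k<=n} (pi_k, X_k - X_{k-1}) for every n >= 1."""
    values = value_process(pi, X)
    gains = values[0]
    for k in range(1, len(X)):
        gains += sum(h * (x1 - x0) for h, x0, x1 in zip(pi[k], X[k - 1], X[k]))
        if abs(values[k] - gains) > tol:
            return False
    return True


# Usage: asset 0 is a bank account with constant price 1, asset 1 is a stock.
X = [(1.0, 10.0), (1.0, 12.0), (1.0, 9.0)]
pi = [(5.0, 0.5), (0.0, 1.0), (6.0, 0.5)]   # pi_k is chosen at time k-1 with (pi_k, X_{k-1}) = X^pi_{k-1}
```

Here the value path is 10, 12, 10.5 and every change in value is explained by price changes alone; replacing the last holdings by `(7.0, 0.5)` corresponds to an inflow of capital at time 1 and violates (3).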


In the same way, a reasonable definition of a self-financing portfolio or a
self-financing strategy $\pi$ (we shall write $\pi \in SF$) in the continuous-time case could be
the equality

$$X_t^\pi = X_0^\pi + \int_0^t (\pi_s, dX_s) \tag{5}$$

for each $t > 0$. Equivalently,

$$X_t^\pi = X_0^\pi + \int_0^t \sum_{i=0}^{d} \pi_s^i\, dX_s^i, \tag{6}$$

which we can symbolically express as follows:

$$dX_t^\pi = (\pi_t, dX_t).$$

Of course, we must first define the ‘stochastic vector integrals’ in (5).
One way is simply to set

$$\int_0^t (\pi_s, dX_s) = \sum_{i=0}^{d} \int_0^t \pi_s^i\, dX_s^i \tag{7}$$

by definition, i.e., to treat a ‘stochastic vector integral’ as the sum of usual
‘stochastic integrals’.

This is quite sensible (and we shall use this definition) in the case of ‘simple’ 
functions; in fact, this is the most natural (if not the unique) construction suggested 
by the term ‘integration’. 

It turns out, however, that the definition (7) does not cover all cases when
a ‘stochastic vector integral’ $\int_0^t (\pi_s, dX_s)$ can be defined, e.g., as a limit of some
integrals of ‘simple’ processes $\pi(n) = (\pi_s(n))_{s \ge 0}$, $n \ge 1$, approximating $\pi = (\pi_s)_{s \ge 0}$
in some suitable sense.

The point here is as follows.

First, even the usual (scalar) stochastic integrals $\pi^i \cdot X^i \equiv \int_0^{\cdot} \pi^i_s \, dX^i_s$ can be
defined for a broader class of predictable processes $\pi^i$ than the locally bounded
ones discussed in our exposition in Chapter III, § 5a. (What makes locally bounded
processes $\pi^i$ attractive is the following feature: if $X^i \in \mathscr{M}_{\mathrm{loc}}$, then the stochastic
integral $\pi^i \cdot X^i$ is also in the class $\mathscr{M}_{\mathrm{loc}}$; see property (c) in Chapter III, § 5a.7.)

Second, the ‘component-wise’ definition (7) does not take into account the possible
‘interference’ of the semimartingales involved; in principle, this interference
can extend the class of vector-valued processes $\pi = (\pi^0, \pi^1, \ldots, \pi^d)$ that can be
approximated by ‘simple’ processes $\pi(n)$, $n \ge 1$.


3. We explain now the main ideas and results of ‘stochastic vector integration’ that
takes these points into consideration; we refer to special literature for details (see,
e.g., [74], [172], [248; Chapter II], [249], [250], [303], [304], or [347]).
Let $X = (X^1, \ldots, X^d)$ be a $d$-dimensional semimartingale admitting a decomposition

$$X = X_0 + A + M, \qquad (8)$$

where $A = (A^1, \ldots, A^d)$ is a process of bounded variation and $M = (M^1, \ldots, M^d)$
is a local martingale ($A \in \mathscr{V}$ and $M \in \mathscr{M}_{\mathrm{loc}}$).

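Clearly, we can find a nondecreasing adapted (to the flow $\mathbb{F} = (\mathscr{F}_t)_{t \ge 0}$) process
$C = (C_t)_{t \ge 0}$, $C_0 = 0$, and adapted processes $c^i = (c^i_t)$ and $c^{ij} = (c^{ij}_t)$, $i, j = 1, \ldots, d$,
such that

$$A^i_t = \int_0^t c^i_s \, dC_s, \qquad (9)$$

while the quadratic variations satisfy the equality

$$[M^i, M^j]_t = \int_0^t c^{ij}_s \, dC_s. \qquad (10)$$

(As regards the definition of the processes $[M^i, M^j]$ and their property $[M^i, M^i]^{1/2} \in \mathscr{A}_{\mathrm{loc}}$,
see § 5b, Chapter III.)
Let $\pi = (\pi^1, \ldots, \pi^d)$ be a predictable process.
We shall say that

$$\pi \in L_{\mathrm{var}}(A), \qquad (11)$$

if (for each $\omega \in \Omega$)

$$\int_0^t \Bigl| \sum_{i=1}^{d} \pi^i_s c^i_s \Bigr| \, dC_s < \infty, \quad t > 0. \qquad (12)$$

We also write

$$\pi \in L^q_{\mathrm{loc}}(M) \qquad (13)$$

($q \ge 1$) if

$$\Bigl( \int_0^{\cdot} \Bigl( \sum_{i,j=1}^{d} \pi^i_s c^{ij}_s \pi^j_s \Bigr) dC_s \Bigr)^{q/2} \in \mathscr{A}_{\mathrm{loc}},$$

i.e., if there exists a sequence of Markov times $\tau_n$ approaching $\infty$ as $n \to \infty$ such that

$$\mathsf{E} \Bigl[ \int_0^{\tau_n} \Bigl( \sum_{i,j=1}^{d} \pi^i_s c^{ij}_s \pi^j_s \Bigr) dC_s \Bigr]^{q/2} < \infty. \qquad (14)$$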


If there exists a representation $X = X_0 + A + M$ such that a predictable process
$\pi$ belongs to the class $L_{\mathrm{var}}(A) \cap L^q_{\mathrm{loc}}(M)$, then we write

$$\pi \in L^q(X) \qquad (15)$$

(or also $\pi \in L^q(X; \mathsf{P}, \mathbb{F})$, if we must emphasize the role of the underlying measure $\mathsf{P}$
and the flow $\mathbb{F} = (\mathscr{F}_t)_{t \ge 0}$).
Importantly, the fact that $\pi$ belongs to $L^q(X)$ is independent of the choice of
the dominating process $C = (C_t)_{t \ge 0}$ (see, e.g., [249]).
Both in the scalar and in the vector cases one standard definition of the stochastic
integral $\int_0^t (\pi_s, dX_s)$, $t > 0$, for $\pi \in L^q(X)$ consists in setting

$$\int_0^t (\pi_s, dX_s) = \int_0^t (\pi_s, dA_s) + \int_0^t (\pi_s, dM_s), \qquad (16)$$

where

$$\int_0^t (\pi_s, dA_s) = \sum_{i=1}^{d} \int_0^t \pi^i_s c^i_s \, dC_s \qquad (17)$$

is the sum of (trajectory-wise) Lebesgue-Stieltjes integrals and

$$\int_0^t (\pi_s, dM_s) \qquad (18)$$

is the stochastic integral with respect to the local martingale $M = (M^1, \ldots, M^d)$.
The definition of Lebesgue-Stieltjes integrals with respect to a process of
bounded variation (for $\pi \in L_{\mathrm{var}}(A)$ and arbitrary $\omega \in \Omega$) encounters no difficulties.
The main problem here is
a) to give a definition of the (vector) integrals (18) with respect to a local
martingale $M$ (for $\pi \in L^q_{\mathrm{loc}}(M)$)
and
b) to prove the consistency of the definition (16) or, in other words, to show
that the values of the resulting integrals $\int_0^t (\pi_s, dX_s)$ are independent of a
particular semimartingale decomposition (8).


Remark 1. There are several pitfalls in the ‘natural’ definition (16).
First, one does not automatically get, say, the property of linearity

$$a \int_0^t (\pi'_s, dX_s) + b \int_0^t (\pi''_s, dX_s) = \int_0^t (a\pi'_s + b\pi''_s, dX_s).$$

It is not a priori clear whether the integrability withstands the replacement of
the measure $\mathsf{P}$ by some equivalent measure $\widetilde{\mathsf{P}}$, i.e., whether $L^q(X; \mathsf{P}, \mathbb{F}) = L^q(X; \widetilde{\mathsf{P}}, \mathbb{F})$
and whether the values of the corresponding integrals are the same (if only $\mathsf{P}$-a.s.).

Neither is it clear whether this definition is invariant under a reduction of the
flow of $\sigma$-algebras $\mathbb{F} = (\mathscr{F}_t)_{t \ge 0}$. Namely, assume that $X$ is an $\mathbb{F}$-semimartingale such
that the $X_t$ are $\mathscr{G}_t$-measurable, where $\mathbb{G} = (\mathscr{G}_t)_{t \ge 0}$ is a flow of $\sigma$-algebras satisfying
general conditions (Chapter III, § 5a) and $\mathscr{G}_t \subseteq \mathscr{F}_t$, $t \ge 0$. It is well known (see,
e.g., [249] and [250]) that $X$ is also a $\mathbb{G}$-semimartingale in that case. Hence one
would anticipate that if a process $\pi$ is $\mathbb{G}$-adapted, then

$$\pi \in L^q(X; \mathsf{P}, \mathbb{F}) \implies \pi \in L^q(X; \mathsf{P}, \mathbb{G})$$

and the value of the stochastic integral is independent of the particular stochastic
basis, $(\Omega, \mathscr{F}, \mathbb{F})$ or $(\Omega, \mathscr{F}, \mathbb{G})$, underlying the processes $\pi$ and $X$.

It is shown in [74], [248], and [249] that all these properties hold nicely (for each
$q \ge 1$).


4. We describe now the construction of the integrals (18) with respect to a local
martingale $M$ for $\pi \in L^q_{\mathrm{loc}}(M)$.

If $\pi \in L^2(M)$, i.e.,

$$\|\pi\|_{L^2(M)} = \Bigl[ \mathsf{E} \int_0^{\infty} \Bigl( \sum_{i,j=1}^{d} \pi^i_s c^{ij}_s \pi^j_s \Bigr) dC_s \Bigr]^{1/2} < \infty, \qquad (19)$$

then the stochastic integral $(\pi \cdot M)_t \equiv \int_0^t (\pi_s, dM_s)$ can be defined as in the scalar
case for square integrable martingales (see Chapter III, § 5a.4).
Namely, we find first a sequence of simple predictable vector processes $\pi(n) =
(\pi^1(n), \ldots, \pi^d(n))$, $n \ge 1$, such that

$$\|\pi - \pi(n)\|_{L^2(M)} \to 0 \quad \text{as } n \to \infty. \qquad (20)$$

For these processes $\pi(n)$ the integrals $(\pi(n) \cdot M)_t$ can be defined component-wise
by formula (7).

By the Burkholder-Gundy-Davis inequality (see, e.g., [248; 2.34] or [304; Chapter 1, § 9]),

$$\mathsf{E} \sup_{u \le t} \Bigl| \int_0^u (\pi_s(n), dM_s) \Bigr|^2 \le C_2 \|\pi(n)\|^2_{L^2(M)}, \quad t > 0, \qquad (21)$$

with some universal constant $C_2$.
We conclude from (20) and (21) that

$$\mathsf{E} \sup_{u \le t} \Bigl| \int_0^u (\pi_s(n) - \pi_s(m), dM_s) \Bigr|^2 \to 0 \quad \text{as } m, n \to \infty.$$


Since the $L^2$-variables make up a complete space, there exists for each $t > 0$
a random variable, denoted by $(\pi \cdot M)_t$ or $\int_0^t (\pi_s, dM_s)$ and called the stochastic
vector integral of $\pi \in L^2(M)$ with respect to the local martingale $M$, such that

$$\mathsf{E} \sup_{u \le t} \Bigl| \int_0^u (\pi_s(n), dM_s) - \int_0^u (\pi_s, dM_s) \Bigr|^2 \to 0 \quad \text{as } n \to \infty.$$

It is easy to see (cf. the proof of Theorem 4.40 in [250; Chapter I]) that we can
choose the variables $\int_0^t (\pi_s, dM_s)$, $t \ge 0$, such that the process $\bigl( \int_0^t (\pi_s, dM_s) \bigr)_{t \ge 0}$ is
adapted to the flow $\mathbb{F} = (\mathscr{F}_t)_{t \ge 0}$ and has right-continuous trajectories with limits
from the left for each $t > 0$.

Using the standard localization trick we can extend these definitions for
$\pi \in L^2(M)$ to the class of predictable processes $\pi \in L^2_{\mathrm{loc}}(M)$, i.e., to processes
such that (14) holds with $q = 2$.

It is much more complicated to construct stochastic integrals with respect to
local martingales $M$ for processes $\pi$ in the class $L^1_{\mathrm{loc}}(M)$, i.e., for processes such
that

$$\Bigl( \int_0^{\cdot} \Bigl( \sum_{i,j=1}^{d} \pi^i_s c^{ij}_s \pi^j_s \Bigr) dC_s \Bigr)^{1/2} \in \mathscr{A}_{\mathrm{loc}}.$$

Even in the scalar case ($d = 1$), where this condition has the simple form

$$(\pi^2 \cdot [M, M])^{1/2} \in \mathscr{A}_{\mathrm{loc}},$$

the construction of the stochastic integral $\pi \cdot M$ involves some refined techniques
based on some properties of local martingales that are far from trivial (see [304;
Chapter 2, § 2]).


Remark 2. It is worth noting in connection with the condition $(\pi^2 \cdot [M, M])^{1/2} \in
\mathscr{A}_{\mathrm{loc}}$ that it definitely holds for locally bounded processes $\pi$ since, as observed
earlier, each local martingale $M$ has the property $[M, M]^{1/2} \in \mathscr{A}_{\mathrm{loc}}$.

As regards different definitions of stochastic vector integrals $(\pi \cdot M)_t \equiv
\int_0^t (\pi_s, dM_s)$ with respect to local martingales $M$ and for $\pi \in L_{\mathrm{loc}}(M)$ ($= L^1_{\mathrm{loc}}(M)$)
or of stochastic integrals $(\pi \cdot X)_t \equiv \int_0^t (\pi_s, dX_s)$ with respect to semimartingales
$X$ for $\pi \in L(X)$ ($= L^1(X)$), see [74], [249], and [250], where the authors establish
the following, well-anticipated properties (here $X$ and $Y$ are semimartingales):
a) if $\rho$ is a predictable bounded process and $\pi \in L(X)$, then $\rho\pi \in L(X)$,
$\rho \in L(\pi \cdot X)$ and $(\rho\pi) \cdot X = \rho \cdot (\pi \cdot X)$;
b) if $X \in \mathscr{M}_{\mathrm{loc}}$, then $L_{\mathrm{loc}}(X) \subseteq L(X)$;
c) if $X \in \mathscr{V}$, then $L_{\mathrm{var}}(X) \subseteq L(X)$;
d) $L(X) \cap L(Y) \subseteq L(X + Y)$, and if $\pi \in L(X) \cap L(Y)$, then $\pi \cdot X + \pi \cdot Y =
\pi \cdot (X + Y)$;
e) $L(X)$ is a vector space and $\pi' \cdot X + \pi'' \cdot X = (\pi' + \pi'') \cdot X$ for $\pi', \pi'' \in L(X)$.
The question whether $L(X)$ is the maximal class of processes $\pi$ where a)-e) hold
is discussed in [249] and [250]. Another argument in favor of its ‘maximality’ is the
following result of J. Mémin [343]: the space $\{\pi \cdot X : \pi \in L(X)\}$ is closed in the
space of semimartingales with respect to the Emery topology ([74], [138]).


5. If the process $\pi = (\pi^1, \ldots, \pi^d)$ is locally bounded and $M \in \mathscr{M}_{\mathrm{loc}}$, then the stochastic
vector integral process $\bigl( \int_0^t (\pi_s, dM_s) \bigr)_{t \ge 0}$ is a local martingale ([74], [249]).
In the scalar case we mentioned this property in Chapter III, § 5a.7 (property (c)).
It is worth noting that if there is no local boundedness, then the stochastic
integral $\int_0^t \pi_s \, dM_s$ with respect to a local martingale $M$ is not, in general, a local
martingale even in the scalar case, as shown by the following example.


EXAMPLE (M. Emery [137]). We consider two independent stopping times $\sigma$ and $\tau$
on a probability space $(\Omega, \mathscr{F}, \mathsf{P})$ that have exponential distribution with parameter
one. We set

$$M_t = \begin{cases} 0, & t < \min(\sigma, \tau), \\ 1, & t \ge \min(\sigma, \tau) = \sigma, \\ -1, & t \ge \min(\sigma, \tau) = \tau. \end{cases} \qquad (22)$$

If $\mathscr{F}_t = \sigma(M_s, s \le t)$, $t \ge 0$, then it is easy to see that $M = (M_t, \mathscr{F}_t, \mathsf{P})$ is a
martingale.

We consider a deterministic (and therefore, predictable) process $\pi = (\pi_t)_{t \ge 0}$
with $\pi_0 = 0$ and $\pi_t = 1/t$ for $t > 0$.
The martingale $M$ is a process of bounded variation and $\pi$ is integrable with respect
to $M$ (in the Lebesgue-Stieltjes sense) because the random variable $\min(\sigma, \tau)$
is strictly positive with probability one.

By property b) in subsection 4, the integral $\int_0^t \pi_s \, dM_s$ with respect to $M$ coincides
with the Lebesgue-Stieltjes integral.
Let $Y_t = \int_0^t \pi_s \, dM_s$, $t \ge 0$. By (22) we obtain

$$Y_t = \begin{cases} 0, & t < \min(\sigma, \tau), \\ \dfrac{1}{\min(\sigma, \tau)}, & t \ge \min(\sigma, \tau) = \sigma, \\ -\dfrac{1}{\min(\sigma, \tau)}, & t \ge \min(\sigma, \tau) = \tau. \end{cases} \qquad (23)$$

The process $Y = (Y_t)_{t \ge 0}$ is not a martingale because $\mathsf{E}|Y_t| = \infty$, $t > 0$. Neither is
this process a local martingale because $\mathsf{E}|Y_T| = \infty$ for each stopping time $T = T(\omega)$,
not identically equal to zero (see [137] for greater detail).
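The divergence of $\mathsf{E}|Y_t|$ in this example can be made concrete numerically. Since $\min(\sigma, \tau)$ is exponentially distributed with parameter 2, formula (23) gives $\mathsf{E}|Y_t| = \int_0^t u^{-1} \, 2e^{-2u} \, du$. The sketch below (the function name, the cut-off `eps`, and the step count are illustrative choices, not part of the source) evaluates the integral over $[\varepsilon, t]$ and shows it growing like $2\log(1/\varepsilon)$.

```python
import math


def truncated_mean_abs_Y(eps: float, t: float = 1.0, steps: int = 200_000) -> float:
    """Approximate E[|Y_t|; eps <= min(sigma, tau) <= t].

    min(sigma, tau) is exponential with rate 2, so this equals the integral
    of (1/u) * 2 * exp(-2u) over [eps, t]. The substitution u = exp(v) turns
    it into the integral of 2 * exp(-2 * exp(v)) over [log eps, log t],
    which is evaluated here with the midpoint rule.
    """
    lo, hi = math.log(eps), math.log(t)
    h = (hi - lo) / steps
    return h * sum(2.0 * math.exp(-2.0 * math.exp(lo + (j + 0.5) * h))
                   for j in range(steps))


# The truncated expectation keeps growing as the cut-off eps shrinks.
values = {eps: truncated_mean_abs_Y(eps) for eps in (1e-2, 1e-4, 1e-6)}
```

Each hundredfold reduction of `eps` adds roughly $2 \log 100$ to the result, so no finite limit exists as `eps` tends to zero.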

6. In connection with M. Emery’s example there arises a natural question on conditions
ensuring that a stochastic vector integral $\bigl( \int_0^t (\pi_s, dX_s) \bigr)_{t \ge 0}$ is a local martingale
if so is the process $X$. We have already mentioned one such condition: the
local boundedness of $\pi$.

The following result of J.-P. Ansel and C. Stricker ([9; Corollaire 3.5]) gives a
condition in terms of the values of the stochastic integral itself, rather than in
terms of $\pi$ (this is convenient in discussions of arbitrage, as we shall see below).

THEOREM ([9]). Let $X = (X^1, \ldots, X^d)$ be a $\mathsf{P}$-local martingale and let $\pi =
(\pi^1, \ldots, \pi^d)$ be a predictable process such that the stochastic integral $\pi \cdot X$ is well
defined and bounded below by a constant ($(\pi \cdot X)_t \ge C$, $t \ge 0$). Then $\pi \cdot X$ is a local
martingale.

7. We now return to the issue of self-financing strategies. 

DEFINITION 1. Let $X = (X^0, X^1, \ldots, X^d)$ be a $(d + 1)$-dimensional nonnegative
semimartingale describing the prices of $d + 1$ kinds of assets. We say that a strategy
$\pi = (\pi^0, \pi^1, \ldots, \pi^d)$ is admissible (relative to $X$) if $\pi \in L(X)$.

DEFINITION 2. An admissible strategy $\pi \in L(X)$ is said to be self-financing if its
value $X^\pi = (X^\pi_t)_{t \ge 0}$ defined by (1) has a representation (5).

We denote the class of self-financing strategies by $SF$ or $SF(X)$; cf. Chapter V,
§ 1a.

For discrete time, self-financing (see (3) or (4)) is equivalent to the relation

$$\sum_{i=0}^{d} X^i_{n-1} \, \Delta\pi^i_n = 0, \quad n \ge 1,$$

(formula (13) in Chapter V, § 1a).
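The equivalence between the gains representation (3) and the budget relation above can be checked on a small discrete-time example. The following is a minimal sketch, assuming a two-asset $(B, S)$ market with made-up prices; the helper name `rebalance_self_financing` and the variable names are hypothetical. The budget relation is checked for $n \ge 2$ only, because the sketch does not model a separate holding $\pi_0$.

```python
def rebalance_self_financing(stock_holdings, B, S, initial_capital):
    """Return bank holdings beta_n that make (beta, gamma) self-financing.

    stock_holdings[n-1] = gamma_n is the number of shares held over (n-1, n];
    B[n] and S[n] are the bank-account and stock prices at time n. The bank
    position absorbs every trade, so no capital enters or leaves.
    """
    betas, values = [], [initial_capital]
    for n, gamma in enumerate(stock_holdings, start=1):
        # Chosen at time n-1: the new portfolio costs exactly the current value.
        beta = (values[-1] - gamma * S[n - 1]) / B[n - 1]
        betas.append(beta)
        values.append(beta * B[n] + gamma * S[n])
    return betas, values


B = [1.0, 1.02, 1.04, 1.06]
S = [100.0, 110.0, 99.0, 120.0]
gammas = [1.0, 0.5, 2.0]
betas, values = rebalance_self_financing(gammas, B, S, initial_capital=100.0)

# Gains representation (3): X_n = X_0 + sum_k (pi_k, Delta X_k).
gains = [values[0]]
for n in range(1, len(S)):
    gains.append(gains[-1]
                 + betas[n - 1] * (B[n] - B[n - 1])
                 + gammas[n - 1] * (S[n] - S[n - 1]))

# Budget relation: sum_i X^i_{n-1} * Delta pi^i_n = 0 for n >= 2.
budget = [B[n - 1] * (betas[n - 1] - betas[n - 2])
          + S[n - 1] * (gammas[n - 1] - gammas[n - 2])
          for n in range(2, len(S))]
```

Up to floating-point rounding, `gains` reproduces `values` and every entry of `budget` vanishes.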
We consider now the issue of possible analogs of this relation in the continuous-time
case, under the assumption (7).

To this end we distinguish the strategies $\pi = (\pi^0, \pi^1, \ldots, \pi^d)$ whose components
are (predictable) processes of bounded variation ($\pi^i \in \mathscr{V}$, $i = 0, 1, \ldots, d$). One
example of such strategies are simple functions that we chose to be the starting
point in our construction of stochastic integrals (Chapter III, § 5a).

Under the above-mentioned assumption ($\pi^i \in \mathscr{V}$) we obtain

$$d(\pi^i_t X^i_t) = \pi^i_t \, dX^i_t + X^i_{t-} \, d\pi^i_t$$

(see property b) in Chapter III, § 5b.4), so that self-financing condition (5) is equivalent
to the relation

$$\sum_{i=0}^{d} \int_0^t X^i_{s-} \, d\pi^i_s = 0, \quad t > 0, \qquad (24)$$

or, symbolically,

$$(X_{t-}, d\pi_t) = 0. \qquad (25)$$
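In a more general case where $\pi = (\pi^0, \pi^1, \ldots, \pi^d)$ is a predictable semimartingale
we obtain (using Itô’s formula in Chapter III, § 5c)

$$\begin{aligned}
d(\pi^i_t X^i_t) &= \pi^i_{t-} \, dX^i_t + X^i_{t-} \, d\pi^i_t + d[X^i, \pi^i]_t \\
&= \pi^i_t \, dX^i_t - \Delta\pi^i_t \, dX^i_t + X^i_{t-} \, d\pi^i_t + d\langle X^{ic}, \pi^{ic} \rangle_t + \Delta\pi^i_t \, \Delta X^i_t \\
&= \pi^i_t \, dX^i_t + X^i_{t-} \, d\pi^i_t + d\langle X^{ic}, \pi^{ic} \rangle_t.
\end{aligned} \qquad (26)$$

Thus, the condition of self-financing of a semimartingale (predictable) strategy
$\pi$ assumes the following form:

$$\sum_{i=0}^{d} \bigl[ X^i_{t-} \, d\pi^i_t + d\langle X^{ic}, \pi^{ic} \rangle_t \bigr] = 0. \qquad (27)$$

In particular, if $\pi^i \in \mathscr{V}$, then $\langle X^{ic}, \pi^{ic} \rangle = 0$ and we derive (25) from (27).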


8. The main reason why one mostly restricts oneself to the class of semimartingales
in the consideration of continuous-time models of financial mathematics lies in
the fact that (as we see) it is in this class that we can define stochastic (vector)
integrals (which is instrumental in the description of the evolution of capital) and
self-financing strategies. (This factor has been explicitly pointed out by J. Harrison,
D. Kreps, and S. Pliska [214], [215], who were the first to focus on the role of
semimartingales and their stochastic calculus in asset pricing.)

This does not mean at all that semimartingales are ‘the utmost point’. We can
define stochastic integrals for many processes that are not semimartingales, e.g.,
for a fractional Brownian motion and, more generally, for a broad class of Gaussian
processes. Of course, the problem of functions that can be under the integral sign
should be considered separately in these cases.

We recall again that in the case of (scalar) semimartingales stochastic integrals
are defined for all locally bounded predictable processes (Chapter III, § 5a). It is
important here (in particular, for financial calculations) that the stochastic integrals
of such functions with respect to local martingales are also local martingales.

The situation becomes much more complicated, however, if we consider locally
unbounded functions. M. Emery’s example above shows that if $\pi$ is not locally
bounded, then the integral process $\int_0^t (\pi_s, dM_s)$ (even with respect to a martingale
$M$) is not a local martingale in general.

This is in sharp contrast with the discrete-time case where each ‘martingale
transform’ $\sum_{k \le n} (\pi_k, \Delta M_k)$ is a local martingale. (See the theorem in Chapter II, § 1c.)
The following definitions seem appropriate in this context.
ke 
The following definitions seem appropriate in this context. 


DEFINITION 3. Let $X = (X_t, \mathscr{F}_t, \mathsf{P})_{t \ge 0}$ be a semimartingale. Then $X$ is called
a martingale transform of order $d$ ($X \in \mathscr{MT}^d$) if there exists a martingale
$M = (M^1, \ldots, M^d)$ and a predictable process $\pi = (\pi^1, \ldots, \pi^d) \in L_{\mathrm{loc}}(M)$ such
that

$$X_t = X_0 + \int_0^t (\pi_s, dM_s), \quad t \ge 0 \qquad (28)$$

(cf. Definition 7 in Chapter II, § 1c).


DEFINITION 4. Let $X = (X_t, \mathscr{F}_t, \mathsf{P})_{t \ge 0}$ be a semimartingale. Then $X$ is called a
local martingale transform of order $d$ ($X \in \mathscr{MT}^d_{\mathrm{loc}}$) if there exists a local martingale
$M = (M^1, \ldots, M^d)$ and a predictable process $\pi = (\pi^1, \ldots, \pi^d) \in L^1_{\mathrm{loc}}(M)$ such
that the representation (28) holds.


For discrete time the classes $\mathscr{M}_{\mathrm{loc}}$, $\mathscr{MT}^d$, and $\mathscr{MT}^d_{\mathrm{loc}}$ are the same for each
$d \ge 1$ (see the theorem in Chapter II, § 1c). This is no longer so in the general
continuous-time case, as M. Emery’s example shows.


9. The above-introduced concept of self-financing, which characterizes markets 
without in- or outflows of capital, is one possible form of financial constraints on 
portfolio and transactions in securities markets. In Chapter V, § 1a we considered 
other kinds of constraints in the discrete-time case. 

Such balance conditions can be almost mechanically transferred to the continu- 
ous-time case. 

For instance, if $X^0 = B$ is a bank account and $X^1 = S$ is a stock paying
dividends, then the balance conditions can be, by analogy with Chapter V, § 1a.4,
written as follows:

$$dX^\pi_t = \beta_t \, dB_t + \gamma_t \, (dS_t + dD_t), \qquad (29)$$

where $D_t$ is the total dividend yield (of one share) on the time interval $[0, t]$. In
that case

$$d\Bigl( \frac{X^\pi_t}{B_t} \Bigr) = \gamma_t \Bigl( d\Bigl( \frac{S_t}{B_t} \Bigr) + \frac{dD_t}{B_t} \Bigr).$$

The conditions in the cases involving ‘consumption’ and ‘operating expense’
can also be appropriately reformulated (see formulas (25)-(35) and (36)-(40) in
Chapter V, § 1a).


§ 1b. Discounting Processes 


1. Comparing the values (prices) of different assets one usually distinguishes a
‘standard’, ‘basis’ asset and valuates other assets in its terms. For instance, in the
discussion of the S&P500 market of 500 different stocks (see Chapter I, § 1b or,
e.g., [310] for greater detail) it is natural to take the S&P500 Index, a (weighted)
average of these 500 assets, for such a basis.

In Chapter I (see § 2c) we exposed briefly a popular CAPM pricing model, in
which one usually takes a bank account (a riskless asset) for a basis, and the ‘quality’
and ‘riskiness’ of various assets $A$ are measured in terms of their ‘betas’ $\beta(A)$.

In our considerations of $d + 1$ assets $X^0, X^1, \ldots, X^d$ we shall agree to choose
one of them, say, $X^0$, as the basis asset. It is usually the asset that has the ‘most
simple’ structure. It should be pointed out, however, that, in principle, we could
choose an arbitrary process $Y = (Y_t, \mathscr{F}_t)_{t \ge 0}$ to play that role, as long as it is strictly
positive.

There exist also purely ‘analytic’ criteria for our choice of $Y$; namely, if such a
discounting process (or ‘numéraire’ as it is often called; see, e.g., [175]) is suitably
chosen, then the process $X^\pi / Y$ is sometimes easier to manage than $X^\pi$ itself; see
the remark at the end of Chapter V, § 2a.


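2. If $Y = (Y_t, \mathscr{F}_t)_{t \ge 0}$ is a positive process defined on the stochastic basis
$(\Omega, \mathscr{F}, (\mathscr{F}_t)_{t \ge 0}, \mathsf{P})$ in addition to $X = (X^0, X^1, \ldots, X^d)$, then for $i = 0, 1, \ldots, d$ we
set

$$\overline{X}^i = \frac{X^i}{Y}, \qquad \overline{X}^\pi = \frac{X^\pi}{Y}. \qquad (1)$$

If $\pi$ is a self-financing portfolio (relative to $X$), then one would like to know whether
it is self-financing with respect to the discounted portfolio $\overline{X} = (\overline{X}^0, \overline{X}^1, \ldots, \overline{X}^d)$ as well.
To this end we assume that we have property (7) in § 1a and $Y^{-1} = 1/Y$ is a
predictable process of bounded variation ($Y^{-1} \in \mathscr{V}$). Then

$$d\overline{X}^i_t = Y_t^{-1} \, dX^i_t + X^i_{t-} \, dY_t^{-1} \qquad (2)$$

and

$$d\overline{X}^\pi_t = Y_t^{-1} \, dX^\pi_t + X^\pi_{t-} \, dY_t^{-1}. \qquad (3)$$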
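Hence we see from condition of self-financing (6) in § 1a that

$$\begin{aligned}
\sum_{i=0}^{d} \pi^i_t \, d\overline{X}^i_t &= Y_t^{-1} \sum_{i=0}^{d} \pi^i_t \, dX^i_t + \Bigl( \sum_{i=0}^{d} \pi^i_t X^i_{t-} \Bigr) dY_t^{-1} \\
&= Y_t^{-1} \, dX^\pi_t + \Bigl( \sum_{i=0}^{d} \pi^i_t X^i_{t-} \Bigr) dY_t^{-1}.
\end{aligned} \qquad (4)$$

Assume that ($\mathsf{P}$-a.s.) for $t > 0$ we have

$$\sum_{s \le t} \sum_{i=0}^{d} \bigl| \pi^i_s \, \Delta X^i_s \, \Delta Y_s^{-1} \bigr| < \infty. \qquad (5)$$

Then

$$\Bigl( \sum_{i=0}^{d} \pi^i_t X^i_{t-} \Bigr) dY_t^{-1} = \Bigl( \sum_{i=0}^{d} \pi^i_t X^i_t \Bigr) dY_t^{-1} - \Bigl( \sum_{i=0}^{d} \pi^i_t \, \Delta X^i_t \Bigr) \Delta Y_t^{-1}$$

and by (4) we obtain

$$\begin{aligned}
\sum_{i=0}^{d} \pi^i_t \, d\overline{X}^i_t &= Y_t^{-1} \, dX^\pi_t + X^\pi_t \, dY_t^{-1} - \Bigl( \sum_{i=0}^{d} \pi^i_t \, \Delta X^i_t \Bigr) \Delta Y_t^{-1} \\
&= Y_t^{-1} \, dX^\pi_t + X^\pi_{t-} \, dY_t^{-1} + \Bigl( \Delta X^\pi_t - \sum_{i=0}^{d} \pi^i_t \, \Delta X^i_t \Bigr) \Delta Y_t^{-1} \\
&= Y_t^{-1} \, dX^\pi_t + X^\pi_{t-} \, dY_t^{-1},
\end{aligned} \qquad (6)$$

because $dX^\pi_t = \sum_{i=0}^{d} \pi^i_t \, dX^i_t$, and $\Delta X^\pi_t = \sum_{i=0}^{d} \pi^i_t \, \Delta X^i_t$ by the properties of stochastic
integrals (see property f) in Chapter III, § 5a.7).
By (3) and (6) we obtain

$$d\overline{X}^\pi_t = \sum_{i=0}^{d} \pi^i_t \, d\overline{X}^i_t, \qquad (7)$$

which means that the discounted portfolio $\overline{X} = (\overline{X}^0, \overline{X}^1, \ldots, \overline{X}^d)$ has the self-financing
property.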
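A discrete-time analog of (7) is easy to verify numerically: for a self-financing strategy the increments of the discounted value coincide with the strategy-weighted increments of the discounted prices. The sketch below is illustrative only; the prices, the positive process `Y`, and the helper `self_financing_values` are hypothetical, and in discrete time no condition of type (5) is needed.

```python
def self_financing_values(gammas, B, S, x0):
    """Bank holdings and values of a self-financing strategy in a (B, S) market."""
    betas, values = [], [x0]
    for n, gamma in enumerate(gammas, start=1):
        # The bank position is chosen so that rebalancing costs nothing.
        beta = (values[-1] - gamma * S[n - 1]) / B[n - 1]
        betas.append(beta)
        values.append(beta * B[n] + gamma * S[n])
    return betas, values


B = [1.0, 1.02, 1.05, 1.06]
S = [100.0, 110.0, 99.0, 120.0]
Y = [2.0, 2.5, 2.2, 3.0]        # hypothetical positive discounting process
gammas = [1.0, 0.5, 2.0]
betas, values = self_financing_values(gammas, B, S, x0=100.0)

# Left side: increments of the discounted value X^pi / Y.
lhs = [values[n] / Y[n] - values[n - 1] / Y[n - 1] for n in range(1, len(Y))]

# Right side: sum_i pi^i_n * Delta(X^i / Y)_n, the discrete counterpart of (7).
rhs = [betas[n - 1] * (B[n] / Y[n] - B[n - 1] / Y[n - 1])
       + gammas[n - 1] * (S[n] / Y[n] - S[n - 1] / Y[n - 1])
       for n in range(1, len(Y))]
```

The two lists agree up to rounding because the portfolio held over $(n-1, n]$ costs exactly $X^\pi_{n-1}$ at time $n - 1$.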


Remark. We assume in the above proof that $Y$ is a positive predictable process,
$Y^{-1} \in \mathscr{V}$, and (5) holds. As regards other possible conditions ensuring the preservation
of self-financing after discounting, see, for instance, [175].
A classical example of a discounting process is a bank account $B = (B_t)_{t \ge 0}$ with

$$B_t = B_0 \exp\Bigl( \int_0^t r(s) \, ds \Bigr), \qquad (8)$$

where $r = (r(t))_{t \ge 0}$ is some, stochastic in general, interest rate that is usually
assumed to be a positive process. A bank account is a convenient ‘gauge’ for the
assessment of the ‘quality’ of other assets, e.g., stock or bonds.

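3. Let $Y^1$ and $Y^2$ be two discounting assets and let $t \in [0, T]$. We assume that the
discounted process $X / Y^1$ is a ($(d + 1)$-dimensional) martingale with respect to some
measure $\mathsf{P}^1$ on $(\Omega, \mathscr{F}_T)$.
We now find out when there exists a measure $\mathsf{P}^2 \sim \mathsf{P}^1$ such that the discounted
process $X / Y^2$ is a martingale with respect to $\mathsf{P}^2$ (cf. our discussion in Chapter V, § 4).

To this end we assume that the process $Y^2 / Y^1$ is a (positive) martingale with
respect to $\mathsf{P}^1$.
For $A \in \mathscr{F}_T$ we set

$$\mathsf{P}^2(A) = \mathsf{E}_{\mathsf{P}^1} \Bigl( I_A \, \frac{Y^2_T}{Y^1_T} \Big/ \frac{Y^2_0}{Y^1_0} \Bigr). \qquad (9)$$

Clearly, $\mathsf{P}^2$ is a probability measure in $(\Omega, \mathscr{F}_T)$ and $\mathsf{P}^2 \sim \mathsf{P}^1$.
By Bayes’s formula (see (4) in Chapter V, § 3a),

$$\mathsf{E}_{\mathsf{P}^2} \Bigl( \frac{X^i_T}{Y^2_T} \Big| \mathscr{F}_t \Bigr) = \mathsf{E}_{\mathsf{P}^1} \Bigl( \frac{X^i_T}{Y^2_T} \cdot \frac{Y^2_T}{Y^1_T} \Big| \mathscr{F}_t \Bigr) \Big/ \frac{Y^2_t}{Y^1_t} = \frac{X^i_t}{Y^1_t} \Big/ \frac{Y^2_t}{Y^1_t} = \frac{X^i_t}{Y^2_t} \qquad (10)$$

($\mathsf{P}^2$- and $\mathsf{P}^1$-a.s.).
Hence the discounted process $X / Y^2$ is a martingale with respect to the measure $\mathsf{P}^2$
defined by (9).
It should be taken into account that if $f_T$ is an $\mathscr{F}_T$-measurable nonnegative
random variable, then it follows from (9) and (10) (provided that $\mathscr{F}_0 = \{\varnothing, \Omega\}$)
that

$$Y^1_0 \, \mathsf{E}_{\mathsf{P}^1} \Bigl( \frac{f_T}{Y^1_T} \Bigr) = Y^2_0 \, \mathsf{E}_{\mathsf{P}^2} \Bigl( \frac{f_T}{Y^2_T} \Bigr) \qquad (11)$$

and ($\mathsf{P}^2$-, $\mathsf{P}^1$-a.s.)

$$Y^1_t \, \mathsf{E}_{\mathsf{P}^1} \Bigl( \frac{f_T}{Y^1_T} \Big| \mathscr{F}_t \Bigr) = Y^2_t \, \mathsf{E}_{\mathsf{P}^2} \Bigl( \frac{f_T}{Y^2_T} \Big| \mathscr{F}_t \Bigr). \qquad (12)$$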
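Formulas (9) and (11) can be checked by exact arithmetic on a finite sample space. The following sketch assumes a hypothetical one-period model with three states and trivial $\mathscr{F}_0$; all probabilities, numéraire values, and the payoff `f_T` are made-up numbers chosen only so that $Y^2 / Y^1$ is a $\mathsf{P}^1$-martingale.

```python
from fractions import Fraction

# Hypothetical three-state, one-period model with trivial F_0.
p1 = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]    # measure P^1
Y1_T = [Fraction(1), Fraction(2), Fraction(3)]           # first numeraire at T
Y2_T = [Fraction(2), Fraction(1), Fraction(6)]           # second numeraire at T
Y1_0 = Fraction(1)
# Choose Y2_0 so that Y^2 / Y^1 is a P^1-martingale.
Y2_0 = Y1_0 * sum(p * y2 / y1 for p, y1, y2 in zip(p1, Y1_T, Y2_T))

# Formula (9): density of P^2 with respect to P^1.
density = [(y2 / y1) / (Y2_0 / Y1_0) for y1, y2 in zip(Y1_T, Y2_T)]
p2 = [p * z for p, z in zip(p1, density)]

# Formula (11): both numeraires assign the same time-0 value to a payoff f_T.
f_T = [Fraction(5), Fraction(0), Fraction(7)]
price_1 = Y1_0 * sum(p * f / y for p, f, y in zip(p1, f_T, Y1_T))
price_2 = Y2_0 * sum(p * f / y for p, f, y in zip(p2, f_T, Y2_T))
```

Using `Fraction` avoids rounding, so the equality of `price_1` and `price_2` is exact rather than approximate.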
§ 1c. Admissible Strategies. Some Special Classes


1. In accordance with Definition 2 in § 1a, the value $X^\pi = (X^\pi_t)_{t \le T}$ of an admissible
strategy $\pi \in SF(X)$ can be represented as follows:

$$X^\pi_t = X^\pi_0 + \int_0^t (\pi_s, dX_s), \qquad (1)$$

where $\int_0^t (\pi_s, dX_s)$ is a stochastic vector integral with respect to the (nonnegative)
semimartingale $X = (X^0, X^1, \ldots, X^d)$.
We shall assume in what follows that $X^0$ is positive ($X^0_t > 0$, $t \le T$) and take it
as the discounting process. To avoid ‘fractional’ expressions (similar to $X^i_t / X^0_t$)
we assume from the outset that $X^0_t \equiv 1$, i.e., the original semimartingale $X =
(1, X^1, \ldots, X^d)$ is a $(d + 1)$-dimensional discounted asset.


2. We shall now introduce several special classes of admissible strategies; their role 
will be completely revealed by our discussions of ‘martingale criteria’ of the absence 
of the opportunities for arbitrage (see §§ 4 and 5 below). 


DEFINITION 1. For each $a \ge 0$ we set

$$\Pi_a(X) = \{\pi \in SF(X) : X^\pi_t \ge -a, \ t \in [0, T]\}. \qquad (2)$$

The meaning of $a$-admissibility “$X^\pi_t \ge -a$, $t \in [0, T]$” is perfectly clear: the
quantity $a \ge 0$ is a bound (resulting from economic considerations) on the possible
losses in the process of the implementation of the strategy $\pi$.

If $a > 0$, then the value $X^\pi$ can take negative values, which we interpret as
borrowing (either borrowing from the bank account or short selling, say, stock).

If $a = 0$, then the full value $X^\pi_t = \sum_{i=0}^{d} \pi^i_t X^i_t$ must remain nonnegative for all
$0 \le t \le T$.

The classes $\Pi_a(X)$, $a \ge 0$, were introduced already in the first papers ([214],
[215]) devoted to arbitrage theory; later, they were regarded as the most natural
classes of strategies where (by contrast to the well-known ‘St.-Petersburg game’;
see, e.g., [186; 2nd ed.]) the investor is not allowed to ‘double his stake after a loss’
indefinitely long (cf. Example 2 in Chapter V, § 2b).

In papers devoted to necessary and sufficient conditions of the absence
of arbitrage one considers mainly the classes $\Pi_a(X)$, $a \ge 0$, and some of their
generalizations. Here we must mention first of all several papers by F. Delbaen
and W. Schachermayer (see, e.g., [100], [101] and historical and bibliographical
references therein).
3. The classes $\Pi_a(X)$, $a \ge 0$, are by far not the only ‘natural’ classes of admissible
strategies.
The following definition is consistently used by C. A. Sin [447].


DEFINITION 2. Let $g = (g^0, g^1, \ldots, g^d)$ be a $(d + 1)$-dimensional vector with nonnegative
components and let $g(X_t) = (g, X_t) = \sum_{i=0}^{d} g^i X^i_t$.
We set

$$\Pi_g(X) = \{\pi \in SF(X) : X^\pi_t \ge -g(X_t), \ t \in [0, T]\}. \qquad (3)$$

As in the case of Definition 1, condition $X^\pi_t \ge -g(X_t)$, $t \in [0, T]$, is transparent:
at each instant $t$ the quantity $g(X_t)$ imposes a bound on the maximum possible
losses or debt levels considered admissible in an ‘economy’ formed by funds $g^0$ in a
bank account and $g^i$ shares of each of $d$ assets, $i = 1, \ldots, d$.

Clearly, if $g^0 \ge a$, then $\Pi_g(X) \supseteq \Pi_a(X)$: since $X^0 \equiv 1$ and the $X^i$ are nonnegative,
we have $g(X_t) \ge g^0 \ge a$, so that $X^\pi_t \ge -a$ implies $X^\pi_t \ge -g(X_t)$.
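The two admissibility constraints are simple to test on a sampled value path. The sketch below is a hedged illustration: the function names `is_a_admissible` and `is_g_admissible`, the price path, and the ‘doubling’ loss sequence are hypothetical and only mimic the discrete observation of a continuous-time strategy.

```python
def is_a_admissible(values, a):
    """True if the value path satisfies X^pi_t >= -a at every grid time."""
    return all(v >= -a for v in values)


def is_g_admissible(values, prices, g):
    """True if X^pi_t >= -(g, X_t) at every grid time.

    prices[t] is the vector (X^0_t, ..., X^d_t); g has nonnegative entries.
    """
    return all(v >= -sum(gi * xi for gi, xi in zip(g, x))
               for v, x in zip(values, prices))


# Doubling the stake after each loss: after k losses the value is -(2^k - 1).
doubling_losses = [-(2 ** k - 1) for k in range(8)]      # 0, -1, -3, ..., -127

# Discounted prices with X^0 = 1 and one risky asset (made-up numbers).
prices = [(1.0, 50.0 + k) for k in range(8)]

ok_a_127 = is_a_admissible(doubling_losses, 127)          # bound just reached
ok_a_100 = is_a_admissible(doubling_losses, 100)          # bound violated
ok_g = is_g_admissible(doubling_losses, prices, (1.0, 3.0))
```

With $g = (a, 0, \ldots, 0)$ and $X^0 \equiv 1$ the second check reduces to the first, and any fixed bound $a$ is eventually violated if the run of losses continues.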


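4. For a discussion of various issues of arbitrage in semimartingale models it can
be useful to introduce several classes of $\mathscr{F}_T$-measurable ‘test’ pay-off functions
$\psi = \psi(\omega)$ that can in principle be majorized by the returns $\int_0^T (\pi_s, dX_s)$ of a
strategy $\pi$ in one of the classes of admissible strategies introduced above.


DEFINITION 3. For $a \ge 0$ let

$$\Psi_a(X) = \Bigl\{ \psi \in L_\infty(\Omega, \mathscr{F}_T, \mathsf{P}) : \psi \le \int_0^T (\pi_s, dX_s) \text{ for some strategy } \pi \in \Pi_a(X) \Bigr\}$$

and

$$\Psi_+(X) = \Bigl\{ \psi \in L_\infty(\Omega, \mathscr{F}_T, \mathsf{P}) : \psi \le \int_0^T (\pi_s, dX_s) \text{ for some strategy } \pi \in \Pi_+(X) \Bigr\},$$

where $\Pi_+(X) = \bigcup_{a \ge 0} \Pi_a(X)$.


DEFINITION 4. For $g = (g^0, g^1, \ldots, g^d)$ with $g^i \ge 0$, $i = 0, 1, \ldots, d$, we set

$$\Psi_g(X) = \Bigl\{ \psi \in L_g(\Omega, \mathscr{F}_T, \mathsf{P}) : \psi \le \int_0^T (\pi_s, dX_s) \text{ for some strategy } \pi \in \Pi_g(X) \Bigr\},$$

where $L_g(\Omega, \mathscr{F}_T, \mathsf{P})$ is the set of $\mathscr{F}_T$-measurable random variables $\psi$ such that
$|\psi| \le g(X_T)$.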
5. As usual, we introduce the following norm in the space $L_\infty(\Omega, \mathscr{F}_T, \mathsf{P})$ of random
variables $\psi$ (although it would be more accurate to speak about equivalence classes
of random variables; see, e.g., [439; Chapter II, § 10]):

$$\|\psi\|_\infty = \operatorname{ess\,sup} |\psi| = \inf\{0 \le c < \infty : \mathsf{P}(|\psi| > c) = 0\}.$$

This makes $L_\infty$ a complete (and therefore, Banach) space.
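On a finite probability space the essential supremum reduces to a maximum over outcomes of positive probability, which the following minimal sketch illustrates. The function name `ess_sup_norm` and the sample data are hypothetical.

```python
def ess_sup_norm(psi, prob):
    """Essential supremum of |psi| on a finite probability space.

    psi[w] and prob[w] are the value and the probability of outcome w.
    Outcomes of probability zero are ignored, exactly as in
    inf{c >= 0 : P(|psi| > c) = 0}.
    """
    return max((abs(v) for v, p in zip(psi, prob) if p > 0), default=0.0)


psi = [1.0, -3.0, 100.0]
prob = [0.5, 0.5, 0.0]      # the outcome carrying the value 100 is a null set
norm = ess_sup_norm(psi, prob)
```

The value 100 does not contribute because it is attained only on a set of probability zero, which is why one works with equivalence classes of random variables.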

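We shall denote the closures of the sets $\Psi_a(X)$, $a \ge 0$, and $\Psi_+(X)$ with respect
to this norm $\|\cdot\|_\infty$ by $\overline{\Psi}_a(X)$ and $\overline{\Psi}_+(X)$, respectively.

The space $\Psi_g(X)$ is endowed with the norm $\|\cdot\|_g$ defined by the formula

$$\|\psi\|_g = \Bigl\| \frac{\psi}{g(X_T)} \Bigr\|_\infty.$$

We denote the corresponding closure of $\Psi_g(X)$ by $\overline{\Psi}_g(X)$.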


2. Semimartingale Models 
without Opportunities for Arbitrage. 
Completeness 


§2a. Concept of Absence of Arbitrage and Its Modifications 


1. In the case of discrete time ($n \le N < \infty$) and finitely many kinds of assets
($d < \infty$) the extended version of the First fundamental theorem (Chapter V, § 2e)
states that for $(B, S)$-markets

$$\text{ELMM} \iff \text{EMM} \iff \text{NA}. \qquad (1)$$

Here NA is the No-Arbitrage property in the sense of Definition 2 in Chapter V,
§ 2a. The properties EMM and ELMM mean the existence of an Equivalent
Martingale Measure and an Equivalent Local Martingale Measure, respectively.

Thus, if there is no arbitrage in the market in question, then the implication
NA $\Rightarrow$ EMM means the existence of a martingale measure $\widetilde{\mathsf{P}}$ ($\widetilde{\mathsf{P}} \sim \mathsf{P}$), which
(as shown in the previous chapter) enables one to use the well-developed machinery
of martingale theory in calculations.

On the other hand, if there exists at least one martingale measure in our model
of a $(B, S)$-market, then the implication EMM $\Rightarrow$ NA means that we have a ‘fair’
market (in the sense of the absence of opportunities for arbitrage).

The first two implications $\Rightarrow$ and $\Leftarrow$ in (1) are also of principal importance:
they show that martingale and locally martingale measures are in fact the same in
our case.

Clearly, one would also like to have results of type (1) for the continuous-time
case (if only for semimartingale models). However, the situation there is more
complicated, although in essence, there is ‘absence of arbitrage’ (in the sense of an
appropriate definition) if and only if there exists an equivalent measure with certain
‘martingale’ properties (which we specify below).
It will be clear from what follows that for a satisfactory answer to the question
of the validity of (1) we require different versions of the ‘absence of arbitrage’,
depending, in the long run, on the classes of admissible strategies.

We recall in this connection that one requires essentially no constraints on the
strategies $\pi = (\beta, \gamma)$ to prove (1) in the discrete-time case (apart from the standard
assumptions of predictability and self-financing).

Continuous time is different: for a mere definition of self-financing we must
use there stochastic vector integrals $\int_0^t (\pi_s, dX_s)$, and their existence requires the
constraint of admissibility $\pi \in L(X)$ on (predictable) strategies $\pi$.

Of course, if we let into consideration only ‘simple’ strategies, finite linear combinations
of ‘elementary’ ones (Chapter III, § 5a), then there are no ‘technical’
complications due to the definition of vector integrals (see § 1a).

Unfortunately, in the continuous-time case we cannot in general deduce the
existence of martingale measures or measures with some or other ‘martingale’ properties
from the absence of arbitrage in the class of ‘simple’ strategies. (The class of
‘simple’ strategies is too ‘thin’!)

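2. We proceed now to main definitions related to the absence of arbitrage in semimartingale
models $X = (1, X^1, \ldots, X^d)$, $X^i = (X^i_t)_{t \le T}$, $i = 1, \ldots, d$.

The following concept is, in effect, classical (cf. Definition 2 in Chapter V, § 2a).


DEFINITION 1. We say that the property NA holds at time $T$ if for each strategy
$\pi \in SF(X)$ with $X^\pi_0 = 0$ we have

$$\mathsf{P}(X^\pi_T \ge 0) = 1 \implies \mathsf{P}(X^\pi_T = 0) = 1. \qquad (2)$$

DEFINITION 2. We say that the properties $NA_a$ and $NA_+$ hold if

$$\Psi_a(X) \cap L^+_\infty(\Omega, \mathscr{F}_T, \mathsf{P}) = \{0\} \qquad (3)$$

or

$$\Psi_+(X) \cap L^+_\infty(\Omega, \mathscr{F}_T, \mathsf{P}) = \{0\}, \qquad (4)$$

respectively, where $\Psi_a(X)$ and $\Psi_+(X)$ are as defined in § 1c and $L^+_\infty(\Omega, \mathscr{F}_T, \mathsf{P})$ is
the subset of nonnegative random variables in $L_\infty(\Omega, \mathscr{F}_T, \mathsf{P})$.


It is easy to show that (4) is equivalent to the condition

$$\Psi^0_+(X) \cap L^+_\infty(\Omega, \mathscr{F}_T, \mathsf{P}) = \{0\}, \qquad (5)$$

where

$$\Psi^0_+(X) = \Bigl\{ \psi \in L_\infty(\Omega, \mathscr{F}_T, \mathsf{P}) : \psi = \int_0^T (\pi_s, dX_s) \text{ for some strategy } \pi \in \Pi_+(X) \Bigr\}. \qquad (6)$$


DEFINITION 3. We say that the property $\overline{NA}_+$ holds if

$$\overline{\Psi}_+(X) \cap L^+_\infty(\Omega, \mathscr{F}_T, \mathsf{P}) = \{0\}. \qquad (7)$$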
The property $\overline{NA}_+$, which is a refinement of $NA_+$, is consistently used by
F. Delbaen and W. Schachermayer (see, e.g., [100], [101]), who call it the NFLVR-property:
No Free Lunch with Vanishing Risk.

This name can be explained as follows.

In the discussion of the absence of arbitrage in its $NA_+$-version we take for ‘test’
functions $\psi$ only nonnegative functions that are smaller than or equal to the ‘returns’
$\int_0^T (\pi_s, dX_s)$ from a strategy $\pi \in \Pi_+(X)$.

However, in our considerations of the $\overline{NA}_+$-version of the absence of arbitrage
we can take (also nonnegative) ‘test’ functions $\psi \in \overline{\Psi}_+(X) \cap L^+_\infty(\Omega, \mathscr{F}_T, \mathsf{P})$, e.g.,
ones that are the limits (in the norm $\|\cdot\|_\infty$) of some sequences of elements
$\psi^k$ ($k \ge 1$) of $\Psi_+(X)$, which can take, generally speaking, negative values
(in particular, these can be some integrals $\int_0^T (\pi^k_s, dX_s)$).

Since $\psi \ge 0$ and $\|\psi - \psi^k\|_\infty \to 0$ as $k \to \infty$, we can assume (passing, if necessary,
to a subsequence) that $\psi^k \ge -1/k$ (for all $\omega \in \Omega$), which can be interpreted as
vanishing risk.

Remarkably, the absence of arbitrage in the $\overline{NA}_+$-version has a transparent
(‘martingale’) criterion established in [101] (see Theorem 2 in § 2c).


3. We present now versions of the concept of absence of arbitrage based on the use
of strategies in the class $\Pi_g(X)$.


DEFINITION 4. Let $g = (g^0, g^1, \ldots, g^d)$, where $g^i \ge 0$, $i = 0, 1, \ldots, d$. We say that
the properties $NA_g$ and $\overline{NA}_g$ are satisfied if

$$\Psi_g(X) \cap L^+_\infty(\Omega, \mathscr{F}_T, \mathsf{P}) = \{0\}$$

or

$$\overline{\Psi}_g(X) \cap L^+_\infty(\Omega, \mathscr{F}_T, \mathsf{P}) = \{0\},$$

respectively.


In [447], the property $\overline{NA}_g$ is called the NFFLVR-property: No Feasible Free
Lunch with Vanishing Risk.


§2b. Martingale Criteria of the Absence of Arbitrage. 
Sufficient Conditions 


1. We shall assume that our financial market is formed by d+ 1 assets 
X = (i, oe sx); where X’ = (XPeer are nonnegative semimartingales on a 
filtered probability space (Q, F,(Fi)icr,P) with Fr = F, Fo = {2,9}. 

IfPisa probability measure on (Q, Fr) such that P~Pand X isa martingale 
with respect to this measure, i.e., 


X € MP) 
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or it is a local martingale, i.e., 
XE Mg (P), 
then we say that we have the property 


EMM 


or 
ELMM, 


respectively. 

The next theorem, which describes some sufficient conditions of the absence 
of arbitrage, is probably the most useful result of the theory of arbitrage in semi- 
martingale models from the standpoint of financial mathematics and engineering. 
THEOREM 1. In the semimartingale model $X = (1, X^1, \dots, X^d)$, for each $a \ge 0$
and $g = (g^0, g^1, \dots, g^d)$ with $g^i \ge 0$, $i = 0, 1, \dots, d$, we have

$$\text{ELMM} \Longrightarrow NA_a, \tag{1}$$
$$\text{EMM} \Longrightarrow NA_g. \tag{2}$$


Proof. Let $X^\pi$ be the value of a strategy $\pi \in \Pi_g(X)$:

$$X^\pi_t = X^\pi_0 + \int_0^t (\pi_s, dX_s), \qquad t \le T. \tag{3}$$


Assume that $\widetilde P$ is a martingale measure equivalent to $P$. As mentioned in Remark 1
in § 1a, the integrability of $\pi$ with respect to $X$ withstands the replacement
of $P$ by the equivalent measure $\widetilde P$. Hence if $\pi \in \Pi_g(X)$, then the stochastic vector
integral in (3) is well-defined also with respect to $\widetilde P$.

The proof of the absence of arbitrage for the strategy $\pi \in \Pi_g(X)$ with $X^\pi_0 = 0$
(in the sense of property $NA_g$) is based on the demonstration of the supermartingale
property of $X^\pi$ with respect to $\widetilde P$.

Indeed, if the supermartingale property holds, then

$$E_{\widetilde P} X^\pi_T \le E_{\widetilde P} X^\pi_0 = 0, \tag{4}$$

so that we immediately obtain the required relation $X^\pi_T = 0$ ($\widetilde P$- and $P$-a.s.) from
the condition $X^\pi_T \ge 0$ ($P$- and $\widetilde P$-a.s.).


Thus, let us establish the $\widetilde P$-supermartingale property of $X^\pi$.
If $\pi \in \Pi_g(X)$, then ($P$- and $\widetilde P$-a.s.)

$$X^\pi_t = X^\pi_0 + \int_0^t (\pi_s, dX_s) \ge -(g, X_t). \tag{5}$$

By the linearity of stochastic vector integrals from (5) we obtain

$$\int_0^t (\pi_s + g, dX_s) \ge -X^\pi_0 - (g, X_0). \tag{6}$$


The process $X$ is, by definition, a martingale with respect to $\widetilde P$, and since the
stochastic vector integral in (6) is uniformly bounded below, it is a local martingale
by a result of J.-P. Ansel and C. Stricker (see § 1a.6), and therefore (by Fatou’s
lemma), a supermartingale.

Consequently,

$$X^\pi_t = X^\pi_0 + \int_0^t (\pi_s + g, dX_s) - (g, X_t - X_0), \tag{7}$$

where the stochastic integral is a $\widetilde P$-supermartingale and $((g, X_t - X_0))_{t \le T}$ is a
$\widetilde P$-martingale. Hence for $\pi \in \Pi_g(X)$ the process $X^\pi$ is a supermartingale with
respect to the measure $\widetilde P$, which together with (4) proves the required assertion (2).

To prove (1) it suffices to observe that, as follows by (6) with $g = (a, 0, \dots, 0)$, the
stochastic integral (with respect to a local martingale) is bounded below and is therefore
itself a local martingale. Since $X^0 \equiv 1$, it follows that $(g, X_t - X_0) = 0$ for $g = (a, 0, \dots, 0)$.
Hence the right-hand side of (7) is a $\widetilde P$-local martingale, and the proof of (1) can be
completed in a similar way to (2).


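COROLLARY. It follows from (1) that

$$\text{ELMM} \Longrightarrow NA_+. \tag{8}$$

2. Assertions (1), (2), and (8) can be improved in the following way.

THEOREM 2. In the semimartingale model $X = (1, X^1, \dots, X^d)$ we have

$$\text{ELMM} \Longrightarrow \overline{NA}_+, \tag{9}$$

and if $g = (g^0, g^1, \dots, g^d)$ with $g^i \ge 0$, $i = 0, 1, \dots, d$, then

$$\text{EMM} \Longrightarrow \overline{NA}_g. \tag{10}$$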


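Proof. Let $\psi \in \overline{\Psi}_g(X)$ with $\psi \ge 0$. Then there exists a sequence $(\psi^k)_{k \ge 1}$ of
functions in $\Psi_g(X)$ such that

$$\|\psi - \psi^k\|_g \equiv \operatorname*{ess\,sup}_{\omega} \frac{|\psi(\omega) - \psi^k(\omega)|}{g(X_T(\omega))} \le \frac1k \to 0 \quad\text{as } k \to \infty. \tag{11}$$

Without loss of generality we can assume that for all $\omega \in \Omega$ we have

$$-\frac1k \le \frac{\psi(\omega) - \psi^k(\omega)}{g(X_T(\omega))} \le \frac1k,$$

and therefore

$$\psi^k \ge -\frac{g(X_T)}{k}. \tag{12}$$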
Since $\psi^k \in \Psi_g(X)$, there exists a strategy $\pi^k \in \Pi_g(X)$ such that

$$\psi^k \le \int_0^T (\pi^k_s, dX_s). \tag{13}$$

Together with (12) this brings us to the inequality

$$-\frac{g(X_T)}{k} \le \int_0^T (\pi^k_s, dX_s), \tag{14}$$


which shows that, for the sequence of strategies $(\pi^k)_{k \ge 1}$, the negative part of the
returns described by stochastic integrals (the ‘risks’) approaches zero as $k$ increases
(‘vanishing risk’).

Inequality (14) is clearly equivalent to the relation

$$-\frac{(g, X_0)}{k} \le \int_0^T \Bigl(\pi^k_s + \frac{g}{k}, dX_s\Bigr). \tag{15}$$


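Since $|\psi - \psi^k| \le g(X_T)/k$, we obtain in view of (13), (15), and Fatou’s lemma, that

$$0 \le E_{\widetilde P}\psi = E_{\widetilde P} \lim_k \psi^k
= E_{\widetilde P} \lim_k \Bigl(\psi^k + \frac{(g, X_T - X_0)}{k}\Bigr)
\le E_{\widetilde P} \liminf_k \int_0^T \Bigl(\pi^k_s + \frac{g}{k}, dX_s\Bigr)
\le \liminf_k E_{\widetilde P} \int_0^T \Bigl(\pi^k_s + \frac{g}{k}, dX_s\Bigr) \le 0, \tag{16}$$

where the last inequality is a consequence of the $\widetilde P$-supermartingale property of the
stochastic integrals $\int_0^t \bigl(\pi^k_s + \frac{g}{k}, dX_s\bigr)$, $t \le T$.

Hence $\widetilde P(\psi = 0) = P(\psi = 0) = 1$, which proves the implication (10).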
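To prove (9) we assume that $\psi \in \overline{\Psi}_+(X)$ and $\psi \ge 0$. Then there exists a
sequence of functions $(\psi^k)_{k \ge 1}$ in $\Psi_+(X)$ such that

$$\|\psi - \psi^k\|_\infty = \operatorname*{ess\,sup}_{\omega} |\psi(\omega) - \psi^k(\omega)| \le \frac1k \to 0, \tag{17}$$

and moreover

$$\psi^k \le \int_0^T (\pi^k_s, dX_s) \tag{18}$$

for $\pi^k \in \Pi_{a_k}(X)$ with some $a_k \ge 0$.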
By (17) and (18) we obtain

$$-\frac1k \le \int_0^T (\pi^k_s, dX_s). \tag{19}$$

Further, as in (16), we see that

$$0 \le E_{\widetilde P}\psi = E_{\widetilde P} \lim_k \psi^k
\le E_{\widetilde P} \liminf_k \int_0^T (\pi^k_s, dX_s)
\le \liminf_k E_{\widetilde P} \int_0^T (\pi^k_s, dX_s) \le 0,$$

where the last inequality is again a consequence of the $\widetilde P$-supermartingale property
of the stochastic integrals $\int_0^t (\pi^k_s, dX_s)$, $t \le T$. This completes the proof.


§2c. Martingale Criteria of the Absence of Arbitrage. 
Necessary and Sufficient Conditions (a List of Results) 


1. In the present section we state several results concerning necessary and sufficient 
conditions of the absence of arbitrage (in one or another sense). 

We recall again the equivalence EMM $\Longleftrightarrow$ NA in the discrete-time case, which
has become a prototype for various results about general semimartingale models.

Moreover, while the implication EMM $\Longrightarrow$ NA is easy to prove, the proof of
the reverse implication EMM $\Longleftarrow$ NA, where one must either suggest an explicit
construction or prove the existence of a martingale measure, is based on designs
that are far from simple (even in the seemingly simple case of discrete time; see § 2
in Chapter V).

Little wonder, therefore, that the proof of the corresponding results in the
continuous-time case (due mainly to F. Delbaen and W. Schachermayer [100], [101])
is fairly complicated, and we restrict ourselves to a list of several interesting results,
referring the reader to the indicated special literature for details.


THEOREM 1 ([100]). a) Let $X = (1, X^1, \dots, X^d)$ be a semimartingale with bounded
components. Then

$$\text{EMM} \Longleftrightarrow \overline{NA}_+. \tag{1}$$

b) Let $X = (1, X^1, \dots, X^d)$ be a semimartingale with locally bounded components.
Then

$$\text{ELMM} \Longleftrightarrow \overline{NA}_+. \tag{2}$$
2. To state the result concerning necessary and sufficient conditions of property
$\overline{NA}_+$ in general semimartingale models let us introduce, following [101], the concept
of $\sigma$-martingale and $\sigma$-martingale measure.

To this end we recall that for discrete time each local martingale $X$ is also
a martingale transform, i.e., $X = X_0 + \gamma \cdot M$ (see the theorem in Chapter II, § 1c),
where $\gamma$ is a predictable sequence and $M$ is a martingale.

Analyzing the proof of this theorem in Chapter II, § 1c one observes that each
local martingale $X$ can be represented as a martingale transform $X = X_0 + \gamma \cdot M$,
where $\gamma$ is a positive predictable sequence.

Bearing that in mind and following [101] we shall call a martingale transform
with positive elements of $\gamma$ a $\sigma$-martingale. Further, if there exists a measure $\widetilde P \sim P$
such that $X$ is a $\sigma$-martingale with respect to it, then we shall talk about the E$\sigma$MM
property.

Using these concepts, in the discrete-time case we can now put the First fundamental
theorem (Chapter V, §§ 2b, c; see also (1) in § 2a) in the following form:

$$\text{EMM} \Longleftrightarrow \text{ELMM} \Longleftrightarrow \text{E}\sigma\text{MM} \Longleftrightarrow \text{NA}. \tag{3}$$

This form is useful since it suggests routes for generalizations of the First fundamental
theorem to continuous time.

It was a substantial success of the authors of [101] that they developed
a clear perception of the following approach: to find necessary and sufficient
conditions for the absence of arbitrage in general semimartingale models in its
$\overline{NA}_+$-version one must turn to $\sigma$-martingales and $\sigma$-martingale measures.

We now give the corresponding definitions. 


DEFINITION 1. We call a semimartingale $X = (X^1, \dots, X^d)$ a $\sigma$-martingale if there
exist an $\mathbb R^d$-valued martingale $M = (M_t)_{t \le T}$ and an $M$-integrable predictable positive
one-dimensional process $\gamma = (\gamma_t)_{t \le T}$ such that $X = X_0 + \gamma \cdot M$.


DEFINITION 2. If there exists a measure $\widetilde P \sim P$ such that a semimartingale $X$ is a
$\sigma$-martingale with respect to this measure, then we say that $\widetilde P$ is a $\sigma$-martingale
measure and the E$\sigma$MM-property holds.


Remark 1. As already mentioned, the term ‘$\sigma$-martingale’ has been introduced
in [101]. Before that such processes had been called (see, e.g., [73] or [137])
semimartingales of the class $(\Sigma_m)$. We point out that $\sigma$-martingales are special cases
of martingale transforms (see Definition 3 in § 1a).

The following results can be seen as the culmination of the search for
necessary and sufficient conditions for the absence of arbitrage opportunities in
its $\overline{NA}_+$-version.


THEOREM 2 ([101]). In general semimartingale models

$$\text{E}\sigma\text{MM} \Longleftrightarrow \overline{NA}_+. \tag{4}$$

Remark 2. M. Emery’s example (§ 1a.5) shows that a $\sigma$-martingale is not necessarily
a local martingale.

For a clearer insight into the connections between Theorems 1 and 2 and the 
corresponding results in the discrete-time case (see (3)) we reformulate them as 
follows. 


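COROLLARY. In general semimartingale models $X = (1, X^i)_{1 \le i \le d}$, where $X^i = (X^i_t)_{t \le T}$,
$i = 1, \dots, d$, $T < \infty$, we have

$$\text{EMM} \Longrightarrow \text{ELMM} \Longrightarrow \text{E}\sigma\text{MM} \Longleftrightarrow \overline{NA}_+. \tag{5}$$

For locally bounded semimartingales $X = (1, X^i)_{1 \le i \le d}$ with $X^i = (X^i_t)_{t \le T}$,
$i = 1, \dots, d$, $T < \infty$, we have

$$\text{EMM} \Longrightarrow \text{ELMM} \Longleftrightarrow \text{E}\sigma\text{MM} \Longleftrightarrow \overline{NA}_+. \tag{6}$$

For bounded semimartingales $X = (1, X^i)_{1 \le i \le d}$ with $X^i = (X^i_t)_{t \le T}$, $i = 1, \dots, d$,
$T < \infty$, we have

$$\text{EMM} \Longleftrightarrow \text{ELMM} \Longleftrightarrow \text{E}\sigma\text{MM} \Longleftrightarrow \overline{NA}_+. \tag{7}$$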


3. We consider now the question of necessary and sufficient conditions for the
absence of arbitrage in its $\overline{NA}_g$-version.


THEOREM 3 ([447]). In general semimartingale models $X = (1, X^1, \dots, X^d)$ with
$X^i = (X^i_t)_{t \le T}$, $i = 1, \dots, d$, $T < \infty$, the condition $\overline{NA}_g$ for $g = (g^0, g^1, \dots, g^d)$ with
$g^i > 0$, $i = 0, 1, \dots, d$, is equivalent to EMM:

$$\text{EMM} \Longleftrightarrow \overline{NA}_g. \tag{8}$$

We have already established the implication $\Longrightarrow$. The proof of the reverse
implication is based on the following considerations.
Let $X = (1, X^1, \dots, X^d)$ be a semimartingale. Then, as shown in [447], $X$ satisfies
condition $\overline{NA}_g$ if and only if the discounted prices $X/g(X)$ satisfy condition $\overline{NA}_+$.
Since $X/g(X)$ is a bounded semimartingale, it follows by (7) that there exists an
equivalent martingale measure, which proves (8).
To complete our list of results we dwell on two counter-examples. 


EXAMPLE 1 (EMM $\not\Longrightarrow$ NA). We consider a $(B, S)$-market with $B_t \equiv 1$ and $S_t = W_t$,
where $W = (W_t)_{t \ge 0}$ is a standard Wiener process (the linear Bachelier model; see
Chapter VIII, § 1a). For a self-financing strategy $\pi = (\beta, \gamma)$ its value is

$$X^\pi_t = \beta_t + \gamma_t S_t = X^\pi_0 + \int_0^t \gamma_u \, dS_u.$$

We set $\tau = \inf\{t : S_t = 1\}$ and $\gamma_u = I(u \le \tau)$. Then $X^\pi_\tau = X^\pi_0 + S_\tau$, so that if
$X^\pi_0 = 0$, then $X^\pi_\tau = 1$ ($P$-a.s.).

Clearly, there exists in this case a martingale measure (namely, the Wiener
measure); however, our choice of the self-financing strategy $\pi = (\beta, \gamma)$ with
$\gamma_u = I(u \le \tau)$ shows that we have opportunities for arbitrage.
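
The mechanism behind Example 1 can be illustrated numerically. The following is only a sketch under stated assumptions: the Wiener process is replaced by a Gaussian random walk with step `dt`, the infinite horizon is cut at a finite `horizon`, and the names `hit_fraction`, `short_run`, and `long_run` are hypothetical helpers introduced here. The sketch records how often the level 1 is reached before the horizon and how low the wealth $X^\pi_t = S_t$ drops while the position is still open.

```python
import math
import random


def hit_fraction(horizon, dt=0.01, n_paths=400, level=1.0, seed=0):
    """Simulate discretized Wiener paths S on [0, horizon] with S_0 = 0.

    Returns the fraction of paths that reach `level` before `horizon`
    and the lowest value of S (equal to the wealth while gamma = 1)
    observed on any path before it stops.
    """
    rng = random.Random(seed)
    step_sd = math.sqrt(dt)
    n_steps = int(round(horizon / dt))
    hits = 0
    worst = 0.0
    for _ in range(n_paths):
        s = 0.0
        for _ in range(n_steps):
            s += rng.gauss(0.0, step_sd)
            if s >= level:      # tau reached: the position is closed
                hits += 1
                break
            worst = min(worst, s)
    return hits / n_paths, worst


short_run = hit_fraction(1.0)    # horizon T = 1
long_run = hit_fraction(25.0)    # horizon T = 25
```

In runs of this kind the hitting frequency grows with the horizon, while the interim wealth has no uniform lower bound; this is consistent with the strategy of Example 1 not being $a$-admissible for any $a \ge 0$.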


EXAMPLE 2 (ELMM $\not\Longrightarrow$ NA, ELMM $\Longrightarrow \overline{NA}_+$, but ELMM $\not\Longrightarrow NA_g$). Let $X^0_t \equiv 1$, $t \in [0, 1]$,
and let 


where 


and $W = (W_t)_{t \le 1}$ is a Wiener process.


The process $X^1$ is a local martingale. Since $\int_0^1 \gamma_s \, dX^1_s = 1$ for $\gamma_s \equiv -1$, it follows
that we have arbitrage in its classical interpretation. Simultaneously, it follows
from (2) that we have property $\overline{NA}_+$. As regards property $NA_g$, it breaks down here:
let us set (as in [447]) $\pi^0 = 1$ and $\pi^1 = -1$. Then $X^\pi_0 = 0$, $X^\pi_t = 1 - X^1_t \ge -g(X_t)$
with $g(X_t) = 1 + X^1_t$, and $X^\pi_1 = 1$.


4. All our previous discussions of arbitrage related to the case where the processes 
describing the evolution of assets were semimartingales. It would be natural to 
consider now the issues of existence or nonexistence of opportunities for arbitrage 
also in the nonsemimartingale case. 

In the following two examples we consider $(B, S)$-markets in which the process
$S = (S_t)_{t \ge 0}$ is constructed from a fractional Brownian motion $B^H = (B^H_t)_{t \ge 0}$ with
parameter $\tfrac12 < H < 1$, which (as already mentioned in Chapter III, § 2c) is not
a semimartingale, and therefore lacks a (local) martingale measure. This feature
is an indirect indication (cf. Chapter I, § 2f.4) that there may be opportunities
for arbitrage in the corresponding ‘fractal’ models; and indeed, they occur in the
examples below.


EXAMPLE 3 (NELMM $\Longrightarrow$ A). Consider a $(B, S)$-market of the following linear
structure:

$$B_t \equiv 1, \qquad S_t = 1 + B^H_t, \qquad t \ge 0, \tag{9}$$

where $B^H = (B^H_t)_{t \ge 0}$ is a fractional Brownian motion with $\tfrac12 < H < 1$. (Cf. the
linear Bachelier model in Chapter VIII, § 1a.)
We consider the (Markov) strategy $\pi = (\beta, \gamma)$ with

$$\beta_t = -(B^H_t)^2 - 2B^H_t, \tag{10}$$
$$\gamma_t = 2B^H_t. \tag{11}$$

Then its value

$$X^\pi_t = \beta_t + \gamma_t S_t = -(B^H_t)^2 - 2B^H_t + 2B^H_t(1 + B^H_t) = (B^H_t)^2.$$
Using Itô’s formula (22) in Chapter VIII, § 5c we obtain

$$dX^\pi_t = d(B^H_t)^2 = 2B^H_t \, dB^H_t.$$

From (9) and (11) we see that

$$dX^\pi_t = \gamma_t \, dS_t.$$

This means that the strategy $\pi$ in question is self-financing. Since $X^\pi_0 = 0$ and
$X^\pi_t = (B^H_t)^2 \ge 0$ for $t > 0$, it follows that there occurs arbitrage in this $(B, S)$-
market (in the class of 0-admissible strategies) at each instant $t > 0$.
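
The self-financing identity of Example 3 rests on the ordinary chain rule $d(B^2) = 2B\,dB$, which is available because a fractional Brownian motion with $H > \tfrac12$ has zero quadratic variation. The sketch below checks the discrete counterpart of this identity. It is an illustration only: a smooth deterministic path stands in for a trajectory of $B^H$ (an assumption made here to keep the example reproducible), and `check_example3` is a hypothetical helper name.

```python
import math


def check_example3(path):
    """path: values b_0 = 0, b_1, ..., b_n of the driving path B on a grid.

    Returns the terminal value (B_T)^2 of the strategy, the left-point
    sums of gamma dS with gamma = 2B and S = 1 + B, and the sum of
    squared increments of the path.
    """
    gains = 0.0
    quad_var = 0.0
    for b_prev, b_next in zip(path, path[1:]):
        db = b_next - b_prev
        gains += 2.0 * b_prev * db   # gamma_{t_i} * (S_{t_{i+1}} - S_{t_i})
        quad_var += db * db
    value = path[-1] ** 2            # beta_T + gamma_T * S_T = (B_T)^2
    return value, gains, quad_var


n = 100_000
smooth = [math.sin(3.0 * k / n) + 0.5 * k / n for k in range(n + 1)]
value, gains, quad_var = check_example3(smooth)
```

On any grid the value equals the accumulated gains plus the sum of squared increments; when the latter is negligible, as for the path used here, the value is financed by trading gains alone.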


From the financial point of view this is a fairly artificial example in which the
prices $S_t$, $t > 0$, can take negative values. Our next example does not have this
deficiency.


EXAMPLE 4 (NELMM $\Longrightarrow$ A). Consider a $(B, S)$-market such that

$$dB_t = rB_t \, dt, \qquad B_0 = 1, \tag{12}$$
$$dS_t = S_t(r \, dt + \sigma \, dB^H_t), \qquad S_0 = 1, \tag{13}$$

where $B^H = (B^H_t)_{t \ge 0}$ is again a fractional Brownian motion with $\tfrac12 < H < 1$. In
view of formula (33) in Chapter III, § 5c we obtain

$$B_t = e^{rt}, \tag{14}$$
$$S_t = e^{rt + \sigma B^H_t}. \tag{15}$$

We consider now the strategy $\pi = (\beta, \gamma)$ with

$$\beta_t = 1 - e^{2\sigma B^H_t}, \tag{16}$$
$$\gamma_t = 2\bigl(e^{\sigma B^H_t} - 1\bigr). \tag{17}$$

For this strategy we have

$$X^\pi_t = \beta_t B_t + \gamma_t S_t = e^{rt}\bigl(e^{\sigma B^H_t} - 1\bigr)^2.$$

Using Itô’s formula (32) in Chapter II, § 5c we obtain

$$dX^\pi_t = r e^{rt}\bigl(e^{\sigma B^H_t} - 1\bigr)^2 dt + 2\sigma e^{rt + \sigma B^H_t}\bigl(e^{\sigma B^H_t} - 1\bigr) dB^H_t,$$

and it is easy to see that the expression on the right-hand side is just the expression
for $\beta_t \, dB_t + \gamma_t \, dS_t$ that can be obtained taking account of (14)-(17).
Thus,

$$dX^\pi_t = \beta_t \, dB_t + \gamma_t \, dS_t,$$

which means that the strategy $\pi$ defined by (16) and (17) is self-financing.

Since for this strategy we also have $X^\pi_0 = 0$ and $X^\pi_t \ge 0$ for $t > 0$, this model (as
also the one in Example 3) leaves space for arbitrage (in the class of 0-admissible
strategies) for each $t > 0$.
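
The algebra behind the value formula of Example 4 can be checked pointwise. The sketch below takes $\beta_t = 1 - e^{2\sigma b}$ and $\gamma_t = 2(e^{\sigma b} - 1)$, where $b$ stands for the current value of $B^H_t$, and compares $\beta_t B_t + \gamma_t S_t$ with $e^{rt}(e^{\sigma b} - 1)^2$. The function names and the values of `r` and `sigma` are assumptions chosen for illustration.

```python
import math


def example4_value(t, b, r=0.05, sigma=0.2):
    """Portfolio value beta_t * B_t + gamma_t * S_t when B^H_t equals b."""
    bank = math.exp(r * t)                       # B_t = e^{rt}
    stock = math.exp(r * t + sigma * b)          # S_t = e^{rt + sigma b}
    beta = 1.0 - math.exp(2.0 * sigma * b)       # holdings in the bank account
    gamma = 2.0 * (math.exp(sigma * b) - 1.0)    # holdings in the stock
    return beta * bank + gamma * stock


def closed_form(t, b, r=0.05, sigma=0.2):
    """Closed form e^{rt} (e^{sigma b} - 1)^2 of the same value."""
    return math.exp(r * t) * (math.exp(sigma * b) - 1.0) ** 2


grid = [(t / 4.0, b / 4.0) for t in range(0, 9) for b in range(-12, 13)]
max_gap = max(abs(example4_value(t, b) - closed_form(t, b)) for t, b in grid)
```

The closed form makes the two features used in the example visible at once: the value vanishes when $b = 0$ (in particular at $t = 0$) and is nonnegative otherwise.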
§2d. Completeness in Semimartingale Models


1. By analogy with the terminology used in the discrete-time case (see Definition 4
in Chapter V, § 1b) we say that a semimartingale model $X = (1, X^1, \dots, X^d)$ is
complete (or $T$-complete) if each non-negative bounded $\mathcal F_T$-measurable contingent
claim $f_T$ can be replicated, i.e., there exists an admissible self-financing portfolio $\pi$
such that $X^\pi_T = f_T$ ($P$-a.s.).

Of course, this property of replicability depends on the class of self-financing
strategies under consideration.

Recall that by the Second fundamental theorem (Chapter V, §§ 4a, f) an
arbitrage-free model with discrete time $n \le N < \infty$ and finitely many assets ($d < \infty$) is
complete if and only if the set of martingale measures consists of a single element,
a measure $\widetilde P$ equivalent to $P$.


2. We present below one sufficient condition of completeness in general
semimartingale models such that the corresponding class $\mathcal P(P)$ of equivalent martingale
measures is non-empty.

Under this assumption we have the following result.


THEOREM. Assume that the set of martingale measures contains a unique measure $\widetilde P$.
Then for each pay-off function $f_T$ with $E_{\widetilde P}|f_T| < \infty$ there exists a strategy $\pi$ in the
class $SF(X)$ such that $X^\pi_T = f_T$ ($P$-a.s.).


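Proof. This can be established in accordance with the following pattern (cf. the
diagram in Chapter V, § 4a):

$$|\mathcal P(P)| = 1 \;\overset{\{1\}}{\Longrightarrow}\; \text{‘}X\text{-representability’} \;\overset{\{2\}}{\Longrightarrow}\; \text{completeness}.$$

Here the ‘$X$-representability’ with respect to a martingale measure $\widetilde P \in \mathcal P(P)$
means that each martingale $M = (M_t, \mathcal F_t, \widetilde P)_{t \le T}$ defined on the same filtered
probability space $(\Omega, \mathcal F, (\mathcal F_t)_{t \le T}, \widetilde P)$ as the $\widetilde P$-martingale $X$ has a representation

$$M_t = M_0 + \int_0^t \gamma_s \, dX_s, \qquad t \le T,$$

where $\gamma \in L(X)$ (cf. ‘$S$-representability’ in Chapter V, § 4b).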

The implication {1} follows from general results due to J. Jacod (see [248; Chapter I]);
for the purposes of arbitrage theory it has been stated for the first time by
J. Harrison and S. Pliska [215].

The implication {2} can be proved as in the discrete-time case (see the proof of 
the lemma in Chapter V, § 4b). 


As regards particular examples of complete markets, see §§ 4 and 5 below. 


3. Remark 1. The issues of the ‘$X$-representability’ of (local) martingales on incomplete
arbitrage-free markets have been considered, e.g., in [9].

Remark 2. We now dwell on the question of the relations between the concepts of
arbitrage, (local) martingale measure, and completeness that the reader comes across
throughout the book.

It should be emphasized that each of these concepts can be introduced inde- 
pendently of the others. Both arbitrage and completeness were defined in terms 
of the initial (‘physical’) probability measure P, and there was no word about a 
martingale measure. 

In the case of discrete time and finitely many assets ($N < \infty$, $d < \infty$) it turned
out (the First fundamental theorem) that the absence of arbitrage has a simple
equivalent characterization: the existence of a martingale measure ($\mathcal P(P) \ne \varnothing$),
while the completeness in such arbitrage-free models turned out (the Second
fundamental theorem) to be equivalent to the uniqueness of the martingale measure
($|\mathcal P(P)| = 1$).

However, if we take into consideration also more general models, then the ab- 
sence of arbitrage does not require the existence of martingale measures, as shown 
in Example 1 in Chapter VI, § 2b. 

In a similar way, the reader should not think (in connection with the Second 
fundamental theorem) that one can discuss completeness only when there are no 
opportunities for arbitrage or there exist martingale measures. It is formally quite 
possible that, e.g., 


a) there can be completeness both in arbitrage-free models and in models with 
arbitrage; 

b) there can be completeness under conditions where a ‘classical’ (i.e., nonnegative)
martingale measure exists, but is not unique;

c) there can be completeness under conditions where there exists no ‘classical’
martingale measure, but there exists a unique ‘nonclassical’ (i.e., of variable
sign) martingale measure.


3. Semimartingale and Martingale Measures 


§3a. Canonical Representation of Semimartingales. 
Random Measures. Triplets of Predictable Characteristics 


1. In the discrete-time case we discussed the canonical representation in Chapter II,
§ 1b and Chapter V, § 3e.

We recall the essential features of this representation. 

Let $H = (H_n, \mathcal F_n)_{n \ge 0}$ be a stochastic sequence, let $h_n = \Delta H_n$ ($= H_n - H_{n-1}$)
for $n \ge 1$, and let $g = g(x)$ be a bounded truncation function, i.e., a function with
compact support equal to $x$ near the origin (one often uses the function
$g(x) = x I(|x| \le 1)$). Then

$$H_n = H_0 + \sum_{k=1}^n h_k = H_0 + \sum_{k=1}^n (h_k - g(h_k)) + \sum_{k=1}^n g(h_k), \tag{1}$$

and therefore, since the functions $g(h_k)$ are bounded, it follows from (1) by the
Doob decomposition that

$$H_n = H_0 + \sum_{k=1}^n E[g(h_k) \mid \mathcal F_{k-1}]
+ \sum_{k=1}^n \bigl[g(h_k) - E(g(h_k) \mid \mathcal F_{k-1})\bigr] + \sum_{k=1}^n [h_k - g(h_k)]. \tag{2}$$


Taking into consideration the jump measures $\mu_k(A) = I_A(h_k)$, $A \in \mathcal B(\mathbb R \setminus \{0\})$,
$k \ge 1$, and their compensators $\nu_k(A) = E(I_A(h_k) \mid \mathcal F_{k-1}) = P(h_k \in A \mid \mathcal F_{k-1})$,
we see from (2) that

$$H_n = H_0 + \sum_{k=1}^n \int g(x) \, \nu_k(dx) + \sum_{k=1}^n \int g(x) \bigl(\mu_k(dx) - \nu_k(dx)\bigr)
+ \sum_{k=1}^n \int (x - g(x)) \, \mu_k(dx). \tag{3}$$


In a similar way to Chapter V, § 3e relation (3) can be rewritten as follows:

$$H = H_0 + g * \nu + g * (\mu - \nu) + (x - g) * \mu. \tag{4}$$

We call (4) the canonical representation of the sequence $H = (H_n, \mathcal F_n)_{n \ge 0}$.

It is worth noting that for $E|h_k| < \infty$, $k \ge 1$, we still have the representation (4)
if we take $g(x) = x$. In this case the Doob decomposition of the sequence
$H = (H_n, \mathcal F_n)_{n \ge 0}$ has the following form:

$$H = H_0 + x * \nu + x * (\mu - \nu). \tag{5}$$
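
The three terms of the discrete-time canonical representation can be computed explicitly in a simple case. The sketch below assumes, purely for illustration, that the increments $h_k$ are independent and identically distributed with a known finite law, so that the conditional expectations $E[g(h_k) \mid \mathcal F_{k-1}]$ reduce to one constant; the names `truncation`, `canonical_parts`, `law`, and `path` are hypothetical.

```python
def truncation(x):
    """Truncation function g(x) = x * 1(|x| <= 1)."""
    return x if abs(x) <= 1.0 else 0.0


def canonical_parts(increments, law):
    """Split H_n - H_0 into the three sums of the canonical representation.

    law: dict mapping each possible increment value to its probability
    (the increments are assumed i.i.d. with this law).
    """
    mean_g = sum(p * truncation(v) for v, p in law.items())
    drift = mean_g * len(increments)                              # g * nu
    martingale = sum(truncation(h) - mean_g for h in increments)  # g * (mu - nu)
    big_jumps = sum(h - truncation(h) for h in increments)        # (x - g) * mu
    return drift, martingale, big_jumps


law = {-0.5: 0.5, 0.5: 0.3, 3.0: 0.2}
path = [0.5, -0.5, 3.0, -0.5, 0.5, 3.0]
drift, martingale, big_jumps = canonical_parts(path, law)
```

The predictable part collects the compensated mean of the small increments, the martingale part their centered fluctuations, and the last sum only the increments that the truncation function discards.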


2. We proceed now to a discussion of the canonical decomposition for semimartingales
$H = (H_t, \mathcal F_t)_{t \ge 0}$ in the case of continuous time.
Let $g = g(x)$ be a truncation function. We set

$$\check H(g)_t = \sum_{s \le t} [\Delta H_s - g(\Delta H_s)]. \tag{6}$$

Note that $\Delta H_s - g(\Delta H_s)$ can be nonzero only if $|\Delta H_s| > b$ for some $b > 0$. Since for
semimartingales we have $\sum_{s \le t} (\Delta H_s)^2 < \infty$ ($P$-a.s.) for each $t > 0$ (see (24) and (25)
in Chapter II, § 5b), the sums in (6) contain actually only finitely many terms and
$\check H(g)$ is well defined as a process of bounded variation.
The process

$$H(g) = H - \check H(g) \tag{7}$$

has bounded jumps ($|\Delta H(g)| \le b$) and therefore it is a special semimartingale (see
Chapter II, § 5b), i.e., it has a canonical decomposition

$$H(g) = H_0 + M(g) + B(g), \tag{8}$$

where $B(g) = (B_t(g), \mathcal F_t)_{t \ge 0}$ is a predictable process of bounded variation with
$B_0(g) = 0$ and $M(g) = (M_t(g), \mathcal F_t)_{t \ge 0}$ is a local martingale with $M_0(g) = 0$.
From (7) and (8) we see that

$$H = H_0 + M(g) + B(g) + \sum_{s \le \cdot} [\Delta H_s - g(\Delta H_s)]. \tag{9}$$

This is a continuous-time analogue of (2). Now, to deduce an analogue of (4) from (9) we
shall need the concepts of random measure and its compensator.
3. Let $(E, \mathcal E)$ be a measurable space.
DEFINITION 1. A random measure in $\mathbb R_+ \times E$ is a family

$$\mu = \{\mu(dt, dx; \omega);\ \omega \in \Omega\}$$

of non-negative measures on $(\mathbb R_+ \times E, \mathcal B(\mathbb R_+) \otimes \mathcal E)$ satisfying the condition
$\mu(\{0\} \times E; \omega) = 0$ for each $\omega \in \Omega$.


EXAMPLE 1. A classical example of a random (and in addition, integer-valued)
measure $\mu$ is the Poisson measure defined as follows.

For $A \in \mathcal B(\mathbb R_+) \otimes \mathcal E$ let

$$m(A) = E\mu(A; \omega),$$

where $m(A)$ is a $\sigma$-finite (positive) measure.

We assume that for each $t \in \mathbb R_+$ and each $A \in \mathcal B(\mathbb R_+) \otimes \mathcal E$ such that $A \subseteq (t, \infty) \times E$
and $m(A) < \infty$ the random variable $\mu(A; \cdot)$ is independent of the $\sigma$-algebra $\mathcal F_t$.

If the (intensity) measure $m$ has the property $m(\{t\} \times E) = 0$ for each
$t \in \mathbb R_+$, then we call $\mu$ a Poisson measure. If, in addition, $m(dt, dx) = dt \, F(dx)$,
where $F$ is a positive $\sigma$-finite measure, then we call $\mu$ a homogeneous Poisson
measure.

The attribute ‘Poisson’ can be explained by the following property of this measure.

Let $(A_i)_{i \ge 1}$ be a sequence of pairwise disjoint measurable subsets of $\mathbb R_+ \times E$
such that $m(A_i) < \infty$. Then the random variables $\mu(A_i)$, $i \ge 1$, are independent
and $\mu(A_i)$ has Poisson distribution with expectation $m(A_i)$, i.e.,

$$P(\mu(A_i) = k) = \frac{e^{-m(A_i)} (m(A_i))^k}{k!}, \qquad k = 0, 1, \dots.$$

(See [250; Chapter I, § 1c].)


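In Chapter III, § 5a, we introduced two $\sigma$-algebras, $\mathcal O$ and $\mathcal P$, of subsets of
$\mathbb R_+ \times \Omega$, which we called there the $\sigma$-algebras of optional and predictable subsets.

An important role in the applications of integer-valued random measures in
$\mathbb R_+ \times E$ is assigned to the $\sigma$-algebras $\widetilde{\mathcal O} = \mathcal O \otimes \mathcal E$ and $\widetilde{\mathcal P} = \mathcal P \otimes \mathcal E$, which are also
called the optional and the predictable $\sigma$-algebras of subsets of $\mathbb R_+ \times \Omega \times E$.

If $W = W(t, \omega, x)$ is an optional function on $\mathbb R_+ \times \Omega \times E$ and $\mu$ is a random
measure, then we shall denote by $W * \mu = ((W * \mu)_t(\omega), \mathcal F_t)_{t \ge 0}$ the random process

$$(W * \mu)_t(\omega) = \int_{(0,t] \times E} W(s, \omega, x) \, \mu(ds, dx; \omega), \tag{10}$$

where the integral is interpreted as the Lebesgue-Stieltjes integral for each $\omega \in \Omega$
and it is assumed that

$$\int_{(0,t] \times E} |W(s, \omega, x)| \, \mu(ds, dx; \omega) < \infty, \qquad t > 0. \tag{11}$$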
DEFINITION 2. We say that a random measure $\mu$ is optional (predictable) if the
process $W * \mu$ is optional (predictable) for each optional (predictable) function
$W = W(t, \omega, x)$.


DEFINITION 3. An optional measure $\mu$ is said to be $\widetilde{\mathcal P}$-$\sigma$-finite if there exists a
$\widetilde{\mathcal P}$-measurable partitioning $(A_n)_{n \ge 1}$ of the set $\mathbb R_+ \times \Omega \times E$ such that each variable
$(I_{A_n} * \mu)_\infty$ is integrable.


The next theorem is a direct generalization of Corollary 2 (Chapter I, § 5b) to 
the Doob—Meyer decomposition. 


THEOREM 1. Let $\mu$ be an optional $\widetilde{\mathcal P}$-$\sigma$-finite random measure. Then there exists a
unique (up to $P$-indistinguishability) predictable random measure $\nu$, which is called
the compensator of $\mu$, that satisfies each of the following two equivalent conditions:
(a) $E(W * \nu)_\infty = E(W * \mu)_\infty$ for each non-negative $\widetilde{\mathcal P}$-measurable function $W$
on $\mathbb R_+ \times \Omega \times E$;
(b) for each $\widetilde{\mathcal P}$-measurable function $W$ on $\mathbb R_+ \times \Omega \times E$ such that $|W| * \mu$ is a
locally integrable process, the process $|W| * \nu$ is also locally integrable and
$W * \mu - W * \nu$ is a local martingale.


The proof, together with various properties of random measures and their com- 
pensators, can be found in [250; Chapter II] or [304; Chapter 3]. 


Remark. One obvious property of compensator measures $\nu$ is as follows. Let $A \in \mathcal E$.
Then the process $X = (X_t)_{t \ge 0}$ with $X_0 = 0$ and

$$X_t = \mu((0, t] \times A; \omega) - \nu((0, t] \times A; \omega), \qquad t > 0,$$

is a local martingale. The measure $\mu - \nu$ is called for that reason a (random)
martingale measure.


EXAMPLE 2. For a Poisson random measure $\mu$ introduced in Example 1 its
compensator $\nu$ is the intensity measure $m$.


4. We discuss now an important concept of the stochastic integral $W * (\mu - \nu)$ of a
$\widetilde{\mathcal P}$-measurable function $W = W(t, \omega, x)$ with respect to the martingale measure
$\mu - \nu$.

If $|W| * \mu \in \mathcal A^+_{loc}$, then $|W| * \nu \in \mathcal A^+_{loc}$ by Theorem 1 and therefore it seems
reasonable to set by definition

$$W * (\mu - \nu) = W * \mu - W * \nu. \tag{12}$$

It is easy to show that the so defined process $W * (\mu - \nu) = (W * (\mu - \nu)_t)_{t \ge 0}$
has the following two properties:

a) it is a purely discontinuous local martingale (see Chapter III, § 5b);
b) it makes the ‘jumps’

$$\Delta(W * (\mu - \nu))_t = \widetilde W_t, \tag{13}$$

where

$$\widetilde W_t = \int_E W(t, \omega, x) \, \mu(\{t\} \times dx; \omega) - \int_E W(t, \omega, x) \, \nu(\{t\} \times dx; \omega).$$


These properties suggest the following definition (cf. [250; Chapter II, § 1d] and
[304; Chapter 3, § 5]).


DEFINITION 4. The stochastic integral $W * (\mu - \nu)$ of a $\widetilde{\mathcal P}$-measurable function
$W = W(t, \omega, x)$ with respect to the martingale measure $\mu - \nu$ is a purely discontinuous
local martingale $X = (X_t)_{t \ge 0}$ such that the processes $\Delta X = (\Delta X_t)_{t \ge 0}$ and
$\widetilde W = (\widetilde W_t)_{t \ge 0}$ are indistinguishable.


We have already seen that if the compensator $\nu$ of a measure $\mu$ has the property
$|W| * \nu \in \mathcal A^+_{loc}$ (or, equivalently, $|W| * \mu \in \mathcal A^+_{loc}$), then we can take the process
$W * \mu - W * \nu$ as a purely discontinuous local martingale $X$.

However, the condition $|W| * \nu \in \mathcal A^+_{loc}$, which ensures the existence of such a
process $X$ with $\Delta X = \widetilde W$, can be loosened.

Here we restrict ourselves to the introduction of the necessary notation and
the statements of the corresponding results. We refer to the already mentioned
books [250] and [304] for details.

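Let

$$a_t(\omega) = \nu(\{t\} \times E; \omega),$$

$$q(\omega, B) = \sum_{s \in B} I(a_s(\omega) > 0)(1 - a_s(\omega)), \qquad B \in \mathcal B(\mathbb R_+),$$

$$\widehat W_t(\omega) = \int_E W(t, \omega, x) \, \nu(\{t\} \times dx; \omega).$$

We assume that for each finite Markov time $\tau(\omega)$ we have

$$\int_E |W(\tau(\omega), \omega, x)| \, \nu(\{\tau(\omega)\} \times dx; \omega) < \infty \quad (P\text{-a.s.}),$$

and we set

$$G(W) = \frac{(W - \widehat W)^2}{1 + |W - \widehat W|} * \nu + \frac{\widehat W^2}{1 + |\widehat W|} * q. \tag{14}$$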
THEOREM 2. Let $W = W(t, \omega, x)$ be a $\widetilde{\mathcal P}$-measurable function such that

$$G(W) \in \mathcal A^+_{loc}. \tag{15}$$

Then there exists a unique (up to stochastic indistinguishability) purely discontinuous
local martingale $W * (\mu - \nu)$ such that

$$\Delta(W * (\mu - \nu)) = \widetilde W.$$


COROLLARY. Let $\widehat W = 0$. If

$$\frac{W^2}{1 + |W|} * \nu \in \mathcal A^+_{loc},$$

then the stochastic integral $W * (\mu - \nu)$ with respect to the martingale measure
$\mu - \nu$ is well defined (as a purely discontinuous local martingale such that
$\Delta(W * (\mu - \nu))_t = \int_E W(t, \omega, x) \, \mu(\{t\} \times dx; \omega)$).
Remark. Let $\widehat W = 0$. Then

$$[W * (\mu - \nu), W * (\mu - \nu)] = W^2 * \mu, \tag{16}$$

which is a consequence of the observation that

$$[W * (\mu - \nu), W * (\mu - \nu)]_t = \sum_{0 < s \le t} \bigl(\Delta(W * (\mu - \nu))_s\bigr)^2 = (W^2 * \mu)_t.$$

It clearly follows from the above equality (16) that for the predictable quadratic
variation we have

$$\langle W * (\mu - \nu), W * (\mu - \nu) \rangle = W^2 * \nu.$$


5. Special cases of random (moreover, integer-valued) measures are presented by
the jump measures $\mu^H$ of processes $H = (H_t, \mathcal F_t)_{t \ge 0}$ with right-continuous
trajectories having limits from the left (in particular, of semimartingales):

$$\mu^H((0, t] \times A; \omega) = \sum_{0 < s \le t} I_A(\Delta H_s(\omega)),$$

where $A \in \mathcal B(\mathbb R \setminus \{0\})$.

Such a random measure $\mu^H$ in $\mathbb R_+ \times E$ (with $E = \mathbb R \setminus \{0\}$) is $\widetilde{\mathcal P}$-$\sigma$-finite, and
therefore, by the above theorem, the compensator $\nu^H$ of $\mu^H$ is well defined.

We consider the canonical decomposition (9) again.
Using the random jump measure $\mu^H$ the last term on the right-hand side of (9)
can be written as follows:

$$\sum_{s \le \cdot} [\Delta H_s - g(\Delta H_s)] = (x - g(x)) * \mu^H.$$

As mentioned in Chapter III, § 5b.6, each local martingale can be represented
(and moreover, uniquely) as the sum of a continuous and a purely discontinuous
local martingale. Hence the local martingale $M(g)$ in (9) can be represented as
follows:

$$M(g)_t = M(g)_0 + M(g)^c_t + M(g)^d_t, \tag{17}$$

where $M(g)^c$ is the continuous and $M(g)^d$ is the purely discontinuous component.
The continuous local martingale $M(g)^c$ is in fact independent of $g$, and, as
mentioned in Chapter III, § 5b.6, is usually denoted by $H^c$.
As regards the purely discontinuous component $M(g)^d$, which is a purely
discontinuous local martingale, it can be represented as follows:

$$M(g)^d_t = \int_{(0,t] \times \mathbb R} g \, d(\mu^H - \nu^H). \tag{18}$$

To prove this we must verify that, first, $G(g) \in \mathcal A^+_{loc}$ and, second, the jumps
of the local martingales on the right-hand and the left-hand sides of (18) are the
same. It is proved in full in [250; Chapter II, § 2c] and [304; Chapter 3, § 5]. Here
we consider one special case.
Assume that $\nu^H(\{t\} \times E; \omega) = 0$; then

$$G(g) = \frac{g^2}{1 + |g|} * \nu^H$$

and the local integrability of this process is a consequence of the fact that $g = g(x)$
is a truncation function and of the inclusion $(x^2 \wedge 1) * \nu^H \in \mathcal A^+_{loc}$ holding for each
semimartingale $H$. Hence the integral in (18) is well-defined.

Further, $\Delta M(g)^d = \Delta M(g) = g(\Delta H) - \Delta B(g)$, where

$$\Delta B(g)_t = \int g(x) \, \nu^H(\{t\} \times dx; \omega) \tag{19}$$

([250; Chapter I, 2.14]). Hence, assuming that $\nu^H(\{t\} \times E; \omega) = 0$ we see that
$\Delta B(g)_t = 0$, and therefore $\Delta M(g)^d = g(\Delta H)$. However, $\Delta(g * (\mu^H - \nu^H))$ is also
equal to $g(\Delta H)$, and the jumps of the purely discontinuous local martingales on
the right-hand and the left-hand sides of (18) are the same.

Thus, from (9), in view of (17)-(18), we obtain the following representation:

$$H = H_0 + B(g) + H^c + g * (\mu^H - \nu^H) + (x - g(x)) * \mu^H, \tag{20}$$

which is called the canonical representation of the semimartingale $H$.
Comparing (20) and (4) in the discrete-time case we see that their main
difference is the presence of the continuous component $H^c$ in (20).
6. There exist two predictable terms in (20): $B(g)$ and $\nu^H$. The third important
characteristic of $H$ is its ‘angular brackets’ $\langle H^c \rangle$, which is the (predictable)
compensator of the continuous locally square integrable martingale $H^c$.


DEFINITION 5. Let $H = (H_t, \mathcal F_t)_{t \ge 0}$ be a semimartingale and let $g = g(x)$ be a
truncation function. We set $B = B(g)$, $C = \langle H^c \rangle$, and $\nu = \nu^H$.
Then we call the collection

$$T = (B, C, \nu) \tag{21}$$

a triplet of predictable characteristics of $H$.


It must be noted that the components $C$ and $\nu$ of $T$ are independent of our
choice of the truncation function $g = g(x)$. On the other hand, $B$ depends on $g$.
Moreover, if $g$ and $g'$ are two distinct truncation functions, then

$$B(g) - B(g') = (g - g') * \nu.$$


7. We now discuss several properties of semimartingales that can be stated in terms
of predictable characteristics $B$, $C$, and $\nu$.
a) If $H$ is a semimartingale, then

$$(x^2 \wedge 1) * \nu \in \mathcal A_{loc},$$

i.e., the process $\bigl(\int_{(0,t] \times \mathbb R} (x^2 \wedge 1) \, d\nu\bigr)_{t \ge 0}$ is locally integrable. In other words, there
exists a sequence of Markov times $\tau_n$, $\tau_n \uparrow \infty$ ($P$-a.s.) such that

$$E \int_{(0,\tau_n] \times \mathbb R} (x^2 \wedge 1) \, d\nu < \infty.$$

b) The semimartingale $H$ is special (in particular, it is a local martingale) if and
only if

$$(x^2 \wedge |x|) * \nu \in \mathcal A_{loc}.$$

c) The semimartingale $H$ is locally square integrable if and only if

$$x^2 * \nu \in \mathcal A_{loc}.$$

The proof of a) is based on the inequality $\sum_{s \le t} (\Delta H_s)^2 < \infty$ ($P$-a.s.), $t > 0$, which
holds for semimartingales; see Remark 3 in Chapter III, § 5b. As regards the proofs
of properties b) and c), see [250; Chapter II, § 2b].
d) If $H = H_0 + N + A$ is a canonical decomposition of a special semimartingale $H$,
then

$$H = H_0 + H^c + x * (\mu - \nu) + A. \tag{22}$$

In other words, for special semimartingales $H$ we can take $g(x) = x$ in the canonical
decomposition (20).

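e) We associate (for each $\theta \in \mathbb R$) the triplet $T = (B, C, \nu)$ of predictable
characteristics of a semimartingale $H$ with the following predictable process of bounded
variation:

$$\Psi(\theta)_t = i\theta B_t - \frac{\theta^2}{2} C_t + \int \bigl(e^{i\theta x} - 1 - i\theta g(x)\bigr) \, \nu((0, t] \times dx; \omega), \tag{23}$$

which is called the cumulant (of $H$); cf. Chapter III, § 1b.
Let

$$G(\theta) = \mathcal E(\Psi(\theta)), \tag{24}$$

where $\mathcal E(\Psi(\theta)) = (\mathcal E(\Psi(\theta))_t)_{t \ge 0}$ is the stochastic exponential constructed from $\Psi(\theta)$
(see Chapter II, § 5c, Example 1):

$$\mathcal E(\Psi(\theta))_t = e^{\Psi(\theta)_t} \prod_{0 < s \le t} (1 + \Delta\Psi(\theta)_s) e^{-\Delta\Psi(\theta)_s}. \tag{25}$$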
THEOREM 3. Let $\Delta\Psi(\theta)_t \ne -1$, $t > 0$. Then the following properties are equivalent:
1) $H$ is a semimartingale with characteristics $(B, C, \nu)$;
2) for each $\theta \in \mathbb{R}$ the process

$$\frac{e^{i\theta H_t}}{G(\theta)_t}, \qquad t \ge 0, \tag{26}$$

is a local martingale.

(As regards the proof based on Itô's formula for semimartingales see [250; Chapter II, § 2d].)

f) The simplest processes $H = (H_t, \mathscr{F}_t)_{t\ge0}$ in the class of semimartingales
are those with independent increments. Their characteristic feature is that the
corresponding triplet $T = (B, C, \nu)$ is non-random. In other words, $B = (B_t)_{t\ge0}$,
$C = (C_t)_{t\ge0}$, and the compensator measure $\nu = \nu(dt, dx)$ are independent of $\omega$ (see
[250; Chapter II, § 4c]).

Hence the cumulant $\Psi(\theta)$ of such processes is independent of $\omega$, and if
$\Delta\Psi(\theta) \ne -1$, which corresponds to the case of a process $H = (H_t)$ continuous in
probability, then for $H_0 = 0$ we obtain from (22) the Lévy–Khintchine formula

$$E e^{i\theta H_t} = e^{\Psi(\theta)_t} = \exp\Bigl\{ i\theta B_t - \frac{\theta^2}{2} C_t + \int (e^{i\theta x} - 1 - i\theta g(x))\,\nu((0,t]\times dx) \Bigr\}. \tag{27}$$


For Lévy processes we have

$$B_t = b\cdot t, \qquad C_t = c\cdot t, \qquad \nu(dt, dx) = dt\,\nu(dx),$$

and

$$E e^{i\theta H_t} = e^{t\psi(\theta)},$$

where

$$\psi(\theta) = i\theta b - \frac{\theta^2}{2} c + \int (e^{i\theta x} - 1 - i\theta g(x))\,\nu(dx). \tag{28}$$

(The function $\psi(\theta)$ is called the cumulant along with $\Psi(\theta)_t$.)
The measure $\nu = \nu(dx)$ satisfies the conditions

$$\nu(\{0\}) = 0, \qquad (x^2 \wedge 1) * \nu < \infty \tag{29}$$

and is called the Lévy measure. (Cf. Chapter III, § 1b.)

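g) We consider now a special case of Lévy processes: 'a Brownian motion with
drift and Poisson jumps'.

More precisely, let

$$H_t = mt + \sigma W_t + \sum_{k=1}^{N_t} \xi_k, \tag{30}$$

where $W = (W_t)_{t\ge0}$ is a Wiener process (Brownian motion), $\xi_1, \xi_2, \dots$ are
independent identically distributed random variables with distribution function $F(x) = P(\xi_1 \le x)$,
and $N = (N_t)_{t\ge0}$ is the standard Poisson process with parameter $\lambda > 0$
($EN_t = \lambda t$). We assume that $W$, $N$, and $(\xi_1, \xi_2, \dots)$ are jointly independent.

The following chain of relations brings us easily to the canonical representation,
giving us the triplet of predictable characteristics:

$$
\begin{aligned}
H_t &= mt + \sigma W_t + \sum_{k=1}^{N_t} \xi_k = mt + \sigma W_t + \int_0^t\!\!\int x\,d\mu \\
&= \Bigl(mt + \int_0^t\!\!\int g(x)\,d\nu\Bigr) + \Bigl(\sigma W_t + \int_0^t\!\!\int g(x)\,d(\mu - \nu)\Bigr) + \int_0^t\!\!\int (x - g(x))\,d\mu \\
&= t\Bigl(m + \lambda\int g(x)\,F(dx)\Bigr) + \Bigl(\sigma W_t + \int_0^t\!\!\int g(x)\,d(\mu - \nu)\Bigr) + \int_0^t\!\!\int (x - g(x))\,d\mu.
\end{aligned}
$$

Hence

$$B(g)_t = t\Bigl(m + \lambda\int g(x)\,F(dx)\Bigr), \qquad C_t = \sigma^2 t, \qquad d\nu = \lambda\,dt\,F(dx).$$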
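
The triplet just obtained can be checked numerically. The following is only an illustrative sketch, under the assumption (not made in the text) that the jump law $F$ is concentrated on finitely many points, so that the integral in (28) becomes a finite sum; the function names and the particular truncation function $g(x) = xI(|x| \le 1)$ are choices made for the illustration. It compares the cumulant (28) built from $(b, c, \nu)$ with the exponent computed directly from the independence of $W$, $N$, and the $\xi_k$.

```python
import cmath

def g(x):
    """Truncation function g(x) = x * I(|x| <= 1) (one common choice)."""
    return x if abs(x) <= 1.0 else 0.0

def triplet(m, sigma, lam, jump_law):
    """Return (b, c, nu) with B(g)_t = b*t, C_t = c*t, nu(dt, dx) = dt * nu(dx).

    jump_law is a list of (value, probability) pairs describing F;
    a discrete F is an assumption made only for this illustration.
    """
    b = m + lam * sum(p * g(x) for x, p in jump_law)
    c = sigma ** 2
    nu = [(x, lam * p) for x, p in jump_law]   # Levy measure lam * F(dx)
    return b, c, nu

def cumulant(theta, b, c, nu):
    """psi(theta) as in (28); the integral reduces to a finite sum here."""
    integral = sum(w * (cmath.exp(1j * theta * x) - 1 - 1j * theta * g(x))
                   for x, w in nu)
    return 1j * theta * b - 0.5 * theta ** 2 * c + integral

def direct_exponent(theta, m, sigma, lam, jump_law):
    """Exponent of E exp(i*theta*H_t) per unit time, from independence of W, N, xi."""
    char_xi = sum(p * cmath.exp(1j * theta * x) for x, p in jump_law)
    return 1j * theta * m - 0.5 * (sigma * theta) ** 2 + lam * (char_xi - 1)

jump_law = [(-2.0, 0.3), (0.5, 0.7)]
b, c, nu = triplet(0.1, 0.4, 2.0, jump_law)
gap = abs(cumulant(0.7, b, c, nu) - direct_exponent(0.7, 0.1, 0.4, 2.0, jump_law))
```

The two exponents agree up to rounding because the terms $i\theta g(x)$ added to $b$ and subtracted under the integral cancel, which is exactly why $C$ and $\nu$ do not depend on $g$ while $B(g)$ does.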
h) Random sequences $H = (H_n, \mathscr{F}_n)_{n\ge0}$ with discrete time can be fitted in a
natural way into continuous-time schemes (see Chapter II, § 1f). If $H_n = H_0 + \sum_{k=1}^{n} h_k$
with $h_k = \Delta H_k$, then the triplet $T = (B(g), C, \nu)$ has the following structure:

$$
\begin{aligned}
B(g)_t &= \sum_{1\le k\le [t]} E[g(h_k) \mid \mathscr{F}_{k-1}], \\
C_t &= 0, \\
\nu((0,t]\times A;\omega) &= \sum_{1\le k\le [t]} P(h_k \in A \mid \mathscr{F}_{k-1})
\end{aligned}
$$

with $A \in \mathscr{B}(\mathbb{R} \setminus \{0\})$.


§ 3b. Construction of Martingale Measures in Diffusion Models.
Girsanov’s Theorem


1. The results of §§ 2b, c on necessary and sufficient conditions of the absence of
arbitrage demonstrate that it is important for arbitrage theory to find martingale
or locally martingale measures equivalent to the original probability measure.

One fairly common method of constructing martingale measures is based
on Girsanov's theorem and its various generalizations. Another method, long known
in actuarial studies, is based on the Esscher transformation (see § 3c below).

We presented Girsanov's theorem in its original formulation from [183] in Chapter III, § 3e.
In the present section we shall prove it, discuss some generalizations,
and present criteria for the local absolute continuity and the equivalence of probability
measures corresponding to diffusion and Itô processes.


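2. We consider a process $X = (X_t, \mathscr{F}_t)_{t\ge0}$, defined on a filtered probability space
$(\Omega, \mathscr{F}, (\mathscr{F}_t)_{t\ge0}, P)$, which is an Itô process (see Chapter III, § 3d) with differential

$$dX_t = a_t(\omega)\,dt + dB_t, \qquad X_0 = 0, \tag{1}$$

where $a = (a_t(\omega), \mathscr{F}_t)_{t\ge0}$ is a process satisfying the condition

$$P\Bigl(\int_0^t |a_s(\omega)|\,ds < \infty\Bigr) = 1, \qquad t > 0, \tag{2}$$

and $B = (B_t, \mathscr{F}_t)_{t\ge0}$ is a standard Brownian motion.
Assume now that $\tilde a = (\tilde a_t(\omega), \mathscr{F}_t)_{t\ge0}$ is another process and

$$P\Bigl(\int_0^t (\tilde a_s(\omega) - a_s(\omega))^2\,ds < \infty\Bigr) = 1, \qquad t > 0. \tag{3}$$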
Then there exists a well-defined process $Z = (Z_t, \mathscr{F}_t)_{t\ge0}$ (cf. (21) in Chapter III, § 3d)
with

$$Z_t = \exp\Bigl(\int_0^t (\tilde a_s - a_s)\,dB_s - \frac12\int_0^t (\tilde a_s - a_s)^2\,ds\Bigr), \tag{4}$$

which is a nonnegative local martingale, e.g., for the localizing times

$$\tau_k = \inf\Bigl\{t : \int_0^t (\tilde a_s - a_s)^2\,ds \ge k\Bigr\}, \qquad k \ge 1.$$

By Fatou's lemma this process is a (nonnegative) supermartingale, and therefore
by Doob's convergence theorem (see Chapter III, § 3b) there exists with probability
one a finite limit $\lim_{t\to\infty} Z_t$ ($= Z_\infty$).

Let

$$EZ_\infty = 1. \tag{5}$$

(This is equivalent to the uniform integrability of the family $\{Z_t,\ t \ge 0\}$.) Then we
can define another probability measure $\tilde P$ on $(\Omega, \mathscr{F})$ by setting

$$d\tilde P = Z_\infty\,dP. \tag{6}$$


THEOREM 1 (I. V. Girsanov, [183]). The process $\tilde B = (\tilde B_t, \mathscr{F}_t)_{t\ge0}$ with

$$\tilde B_t = B_t - \int_0^t (\tilde a_s - a_s)\,ds \tag{7}$$

is a standard Brownian motion with respect to the measure $\tilde P$ and

$$dX_t = \tilde a_t(\omega)\,dt + d\tilde B_t. \tag{8}$$

We present the proof in subsection 3, while here we discuss several consequences
and observations.

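COROLLARY 1. Let

$$X_t = B_t - \lambda\int_0^t a_s\,ds, \tag{9}$$

where $\lambda \in \mathbb{R}$, and let

$$Z_t^\lambda = \exp\Bigl(\lambda\int_0^t a_s\,dB_s - \frac{\lambda^2}{2}\int_0^t a_s^2\,ds\Bigr). \tag{10}$$

Assume that $EZ_\infty^\lambda = 1$ and set $dP^\lambda = Z_\infty^\lambda\,dP$. Then the process $X = (X_t)_{t\ge0}$ is a
standard Brownian motion with respect to the measure $P^\lambda$.

If $EZ_T^\lambda = 1$ for some finite $T$, then $X = (X_t)_{t\le T}$ is a standard Brownian motion
on the interval $[0, T]$ with respect to the measure $P_T^\lambda$ such that $dP_T^\lambda = Z_T^\lambda\,dP_T$,
where $P_T = P \mid \mathscr{F}_T$.
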
COROLLARY 2. Let $\tau = \tau(\omega)$ be a finite Markov time and let

$$E\exp\Bigl(\lambda B_\tau - \frac{\lambda^2}{2}\tau\Bigr) = 1. \tag{11}$$

We set $dP^\lambda = Z_\tau^\lambda\,dP$, where $Z_\tau^\lambda = \exp\bigl(\lambda B_\tau - \frac{\lambda^2}{2}\tau\bigr)$. Then the process $X = (X_t)_{t\ge0}$
with

$$X_t = B_t - \lambda\,(t \wedge \tau)$$

is a standard Brownian motion with respect to the measure $P^\lambda$.


Remark 1. We formulated Girsanov's theorem in Chapter III, § 3e under the assumptions
that $0 \le t \le T$ and $EZ_T = 1$. This is in fact a special case of the current
setting, where $0 \le t < \infty$, for we can set $a_t = 0$ and $\tilde a_t = 0$ for $t > T$.


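Remark 2. We know of some conditions ensuring property (11). These are (see,
e.g., [288], [303], or [402]):

$$E e^{\frac{\lambda^2}{2}\tau} < \infty \quad \text{('the Novikov condition')} \tag{12}$$

and

$$\sup_{t\ge0} E e^{\frac{\lambda}{2} B_{t\wedge\tau}} < \infty \quad \text{('the Kazamaki condition')}. \tag{13}$$

Since $\sup_{t\ge0} E e^{\frac{\lambda}{2} B_{t\wedge\tau}} \le \bigl(E e^{\frac{\lambda^2}{2}\tau}\bigr)^{1/2}$, condition (13) is looser than (12).
If we set, e.g., $\lambda = 1$ and $\tau = \inf\{t : B_t = 1\}$, then $E\sqrt{\tau} = \infty$, and therefore $Ee^{\tau/2} = \infty$.
Thus, (12) fails in this case. Nevertheless, condition (13) holds for this time (because $B_{t\wedge\tau} \le 1$), and

$$E\exp\Bigl(B_\tau - \frac{\tau}{2}\Bigr) = 1.$$

If the stopping times $\tau$ are Markov with respect to the flow $(\mathscr{F}_t^B)_{t\ge0}$ generated
by a Brownian motion, then we can relax (12) and (13).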


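THEOREM 2 ([282]). Let $\varphi = \varphi(t)$ be a nonnegative measurable function such that

$$\limsup_{t\to\infty}\,(B_t - \varphi(t)) = +\infty \quad (P\text{-a.s.}), \tag{14}$$

and let $\tau$ be a Markov time with respect to the flow $(\mathscr{F}_t^B)_{t\ge0}$.
Then each of the conditions

$$\lim_{N\to\infty}\,\sup_{\sigma\in\mathfrak{M}_0^N} E\exp\Bigl\{\frac12(\tau \wedge \sigma) - \varphi(\tau \wedge \sigma)\Bigr\} < \infty \tag{15}$$

or

$$\lim_{N\to\infty}\,\sup_{\sigma\in\mathfrak{M}_0^N} E\exp\Bigl\{\frac12 B_{\tau\wedge\sigma} - \varphi(\tau \wedge \sigma)\Bigr\} < \infty, \tag{16}$$

where $\mathfrak{M}_0^N$ is the class of Markov times $\sigma$ (with respect to $(\mathscr{F}_t^B)_{t\ge0}$) such that
$P(0 \le \sigma \le N) = 1$, is sufficient for the equality

$$E\exp\Bigl\{B_\tau - \frac12\tau\Bigr\} = 1. \tag{17}$$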


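Remark 3. For $Z^\lambda = (Z_t^\lambda)_{t\ge0}$ defined by (10) the corresponding 'Novikov and
Kazamaki conditions' can be stated as follows:

$$E\exp\Bigl\{\frac{\lambda^2}{2}\int_0^\infty a_s^2\,ds\Bigr\} < \infty, \tag{18}$$

$$E\exp\Bigl\{\frac{\lambda}{2}\int_0^\infty a_s\,dB_s\Bigr\} < \infty. \tag{19}$$

As regards the proofs and extensions to processes $Z = (Z_t)_{t\ge0}$ with

$$Z_t = \exp\Bigl\{L_t - \frac12\langle L, L\rangle_t\Bigr\}, \tag{20}$$

where $L = (L_t)_{t\ge0}$ is a continuous local martingale (e.g., $L_t = \int_0^t a_s(\omega)\,dB_s$, $t \ge 0$),
see [288], [303], or [402].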


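3. Proof of Girsanov's theorem. As in the discrete-time case (see Chapter V,
§ 3b), it suffices to verify that ($\tilde P$-a.s.)

$$E_{\tilde P}\bigl[e^{i\theta(\tilde B_t - \tilde B_s)} \mid \mathscr{F}_s\bigr] = e^{-\frac{\theta^2}{2}(t-s)} \tag{21}$$

for $0 \le s \le t$ and $\theta \in \mathbb{R}$.
To this end we set $\alpha_s = \tilde a_s - a_s$, $\tilde B_t = B_t - \int_0^t \alpha_s\,ds$, and

$$Z_t = \exp\Bigl(\int_0^t \alpha_s\,dB_s - \frac12\int_0^t \alpha_s^2\,ds\Bigr)$$

(see (7) and (10)).

By Bayes's formula (Chapter V, § 3a),

$$E_{\tilde P}\bigl[e^{i\theta(\tilde B_t - \tilde B_s)} \mid \mathscr{F}_s\bigr] = \frac{1}{Z_s}\,E\bigl[Z_t\,e^{i\theta(\tilde B_t - \tilde B_s)} \mid \mathscr{F}_s\bigr], \tag{22}$$

and we claim that the right-hand side of (22) is equal to $e^{-\frac{\theta^2}{2}(t-s)}$.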
For simplicity we shall consider the case of $s = 0$.
Let $U_t = e^{i\theta\tilde B_t}$. Then by Itô's formula (Chapter III, § 5c) we see that

$$d(Z_t U_t) = Z_t U_t(\alpha_t + i\theta)\,dB_t - \frac{\theta^2}{2} Z_t U_t\,dt,$$

i.e.,

$$Z_t U_t = 1 + \int_0^t Z_s U_s(\alpha_s + i\theta)\,dB_s - \frac{\theta^2}{2}\int_0^t Z_s U_s\,ds.$$

Hence, using arguments similar to those in the proof of Lévy's theorem (Chapter III, § 5a),
we obtain the following equation for $E_P Z_t U_t$:

$$E_P Z_t U_t = 1 - \frac{\theta^2}{2}\int_0^t E_P Z_s U_s\,ds,$$

from which we conclude that

$$E_{\tilde P}\,e^{i\theta\tilde B_t} = E_P Z_t U_t = e^{-\frac{\theta^2}{2}t}.$$

Formula (21) can be verified in a similar way, which proves that the process
$\tilde B = (\tilde B_t)_{t\ge0}$ defined in (7) is a standard Brownian motion. Relation (8) is a
consequence of (1) and (7). This completes the proof of the theorem.

4. Hence if the process $X$ has the differential $dX_t = a_t(\omega)\,dt + dB_t$ with respect to
the measure $P$, then its differential with respect to the measure $\tilde P$ defined in (6) is
$dX_t = \tilde a_t(\omega)\,dt + d\tilde B_t$, where $\tilde B = (\tilde B_t)_{t\ge0}$ is a Brownian motion with respect to $\tilde P$.

It should be noted that if all our considerations are restricted to a time interval $[0, T]$,
where $T$ can also be a Markov time, then we can replace the condition $EZ_\infty = 1$
by a weaker one, $EZ_T = 1$.


Remark 4. Assume in Girsanov's theorem that $\tilde a_t = 0$, i.e.,

$$\tilde B_t = B_t + \int_0^t a_s\,ds, \qquad t \le T,$$

and

$$Z_t = \exp\Bigl\{-\int_0^t a_s\,dB_s - \frac12\int_0^t a_s^2\,ds\Bigr\}.$$

If $EZ_T = 1$, then the process

$$X_t = B_t + \int_0^t a_s\,ds$$

coincides with $\tilde B_t$ and is a Brownian motion with respect to the measure $\tilde P_T$ such
that $d\tilde P_T = Z_T\,dP$ (cf. Chapter V, § 3b); therefore $\tilde P_T$ is a martingale measure for $X$.
5. Let $X$ be an Itô process on the (filtered) space $(C, \mathscr{C}, (\mathscr{C}_t)_{t\ge0})$ of continuous
functions $\omega = (\omega_t)_{t\ge0}$ with differential (1) and let $\mu^X = \mathrm{Law}(X \mid P)$ be the
probability distribution of this process.

Girsanov's theorem is a convenient tool in answering the question when the
restrictions $\mu_t^X = \mu^X \mid \mathscr{C}_t$ of the measure $\mu^X$ are absolutely continuous, equivalent,
or singular with respect to the measures $\mu_t^B = \mu^B \mid \mathscr{C}_t$.

The measure $\mu^B$ is just the Wiener measure (Chapter III, § 3a), so that we are
discussing the properties of the measure $\mu^X$ corresponding to the process $X$ in its
relations to the Wiener measure $\mu^B$. If $\mu_t^X \ll \mu_t^B$ or $\mu_t^B \ll \mu_t^X$, then one would
also like to have 'explicit' expressions for the Radon–Nikodym derivatives
$\frac{d\mu_t^X}{d\mu_t^B}$ and $\frac{d\mu_t^B}{d\mu_t^X}$.

These issues are studied in much detail in [303; Chapter 7] for the case of Itô
processes and in [250; Chapters II–V] for semimartingales (through the use of the
Hellinger distance and Hellinger processes). For that reason we restrict ourselves
to several results.

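We consider some time interval $[0, T]$.

Assume that

$$P\Bigl(\int_0^T a_s^2(\omega)\,ds < \infty\Bigr) = 1. \tag{23}$$

Then there exists a well-defined process $Z = (Z_t)_{t\le T}$ with

$$Z_t = \exp\Bigl(-\int_0^t a_s(\omega)\,dB_s - \frac12\int_0^t a_s^2(\omega)\,ds\Bigr). \tag{24}$$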


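THEOREM 3. If $EZ_T = 1$, then

$$\mu_T^X \sim \mu_T^B \tag{25}$$

and

$$\frac{d\mu_T^B}{d\mu_T^X}(X(\omega)) = E\Bigl(\exp\Bigl[-\int_0^T a_s(\omega)\,dX_s + \frac12\int_0^T a_s^2(\omega)\,ds\Bigr] \Bigm| \mathscr{F}_T^X\Bigr)(\omega), \tag{26}$$

where $\mathscr{F}_T^X = \sigma(\omega : X_s(\omega),\ s \le T)$.
Proof. We define a measure $\tilde P_T$ by setting $d\tilde P_T = Z_T\,dP_T$ (cf. (6)), where $P_T = P \mid \mathscr{F}_T$.
The process $X = (X_t)_{t\le T}$ is a Brownian motion with respect to $\tilde P_T$ by
Girsanov's theorem, and therefore

$$
\begin{aligned}
\mu_T^B(A) = \tilde P_T(X \in A) &= \int_{\{\omega : X(\omega)\in A\}} Z_T(\omega)\,P(d\omega) \\
&= \int_{\{\omega : X(\omega)\in A\}} E[Z_T \mid \mathscr{F}_T^X](\omega)\,P(d\omega) \\
&= E\bigl(I_A(X(\omega)) \cdot E[Z_T \mid \mathscr{F}_T^X](\omega)\bigr).
\end{aligned}
\tag{27}
$$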
Let $\Phi_T(x)$ be a $\mathscr{C}_T$-measurable functional such that $E[Z_T \mid \mathscr{F}_T^X](\omega) = \Phi_T(X(\omega))$.
Then from (27), using the formula of the change of variables in a Lebesgue integral
(see, e.g., [439; Chapter II, § 6]), we deduce the equality

$$\mu_T^B(A) = \int_{\{\omega : X(\omega)\in A\}} \Phi_T(X(\omega))\,P(d\omega) = \int_A \Phi_T(x)\,\mu_T^X(dx),$$

and therefore $\mu_T^B \ll \mu_T^X$. In addition,

$$\frac{d\mu_T^B}{d\mu_T^X}(x) = \Phi_T(x) \tag{28}$$

and

$$\frac{d\mu_T^B}{d\mu_T^X}(X(\omega)) = E[Z_T \mid \mathscr{F}_T^X](\omega). \tag{29}$$

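We claim now that $\mu_T^X \ll \mu_T^B$. To prove this we observe that $P_T(Z_T(\omega) > 0) = 1$,
and therefore $P_T \ll \tilde P_T$ (cf. Chapter V, § 3a) and

$$\frac{dP_T}{d\tilde P_T}(\omega) = Z_T^{-1}(\omega). \tag{30}$$

Hence

$$
\begin{aligned}
\mu_T^X(A) = P_T(X(\omega) \in A) &= E_{\tilde P_T}\bigl(I_A(X(\omega))\,Z_T^{-1}\bigr) \\
&= E_{\tilde P_T}\bigl(I_A(X)\,E_{\tilde P_T}(Z_T^{-1} \mid \mathscr{F}_T^X)\bigr) = \int_A \tilde\Phi_T(x)\,\mu_T^B(dx),
\end{aligned}
\tag{31}
$$

where $\tilde\Phi_T(x)$ is a $\mathscr{C}_T$-measurable functional such that

$$E_{\tilde P_T}(Z_T^{-1} \mid \mathscr{F}_T^X)(\omega) = \tilde\Phi_T(X(\omega)).$$

From (31) we see that $\mu_T^X \ll \mu_T^B$ and

$$\frac{d\mu_T^X}{d\mu_T^B}(X(\omega)) = E_{\tilde P_T}\bigl[Z_T^{-1} \mid \mathscr{F}_T^X\bigr](\omega). \tag{32}$$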


6. We consider now the special case when an Itô process $X$ is a diffusion-type
process, i.e., let $a_t(\omega) = A(t, X(\omega))$ in (1), where $A(t, x)$ is a nonanticipating
functional ($A(t, x)$ is measurable in $(t, x)$ and for each fixed $t$ the functional $A(t, x)$ is
$\mathscr{C}_t$-measurable in $x$).
In this case, for

$$dX_t = A(t, X)\,dt + dB_t \tag{33}$$

we can deduce from Theorem 3 the following result.
THEOREM 4. Let

$$P\Bigl(\int_0^T A^2(t, X)\,dt < \infty\Bigr) = 1 \tag{34}$$

and let

$$E\exp\Bigl(-\int_0^T A(t, X)\,dB_t - \frac12\int_0^T A^2(t, X)\,dt\Bigr) = 1 \tag{35}$$

or, equivalently,

$$E\exp\Bigl(-\int_0^T A(t, X)\,dX_t + \frac12\int_0^T A^2(t, X)\,dt\Bigr) = 1. \tag{36}$$

Then $\mu_T^X \sim \mu_T^B$,

$$\frac{d\mu_T^B}{d\mu_T^X}(X) = \exp\Bigl(-\int_0^T A(t, X)\,dX_t + \frac12\int_0^T A^2(t, X)\,dt\Bigr), \tag{37}$$

and

$$\frac{d\mu_T^X}{d\mu_T^B}(X) = \exp\Bigl(\int_0^T A(t, X)\,dX_t - \frac12\int_0^T A^2(t, X)\,dt\Bigr). \tag{38}$$


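Remark 5. If we give up the condition $EZ_T = 1$ in Theorem 3, then we can prove
(see [303; Theorem 7.4]) that

$$P\Bigl(\int_0^T A^2(t, X)\,dt < \infty\Bigr) = 1 \implies \mu_T^X \ll \mu_T^B \tag{39}$$

and

$$P\Bigl(\int_0^T A^2(t, X)\,dt < \infty\Bigr) = 1, \quad P\Bigl(\int_0^T A^2(t, B)\,dt < \infty\Bigr) = 1 \implies \mu_T^X \sim \mu_T^B. \tag{40}$$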

7. Comparing Theorems 3 and 4 we see that while we have 'explicit' formulas (37)
and (38) for diffusion-type processes, the corresponding formulas for Itô processes
(see (26)) require the calculation of the conditional expectation $E(\,\cdot \mid \mathscr{F}_T^X)$. The
following result, which is also of independent interest, is useful in the search for
'explicit' formulas for the Radon–Nikodym derivatives also in the case of Itô processes,
since it allows one to 'transfer' the conditional expectation in (26) under the
integral sign.
THEOREM 5. Let $X$ be an Itô process with differential (1), where

$$\int_0^T E|a_s(\omega)|\,ds < \infty. \tag{41}$$

Let $A(t, x)$ be a nonanticipating functional such that

$$A(t, X(\omega)) = E(a_t \mid \mathscr{F}_t^X)(\omega). \tag{42}$$

Then the process $\bar B = (\bar B_t)_{t\le T}$ with

$$\bar B_t = X_t - \int_0^t A(s, X(\omega))\,ds \tag{43}$$

is a Brownian motion (with respect to the flow $(\mathscr{F}_t^X)_{t\le T}$).
If, in addition to (41),

$$P\Bigl(\int_0^T A^2(t, X)\,dt < \infty\Bigr) = 1, \tag{46}$$

$$P\Bigl(\int_0^T A^2(t, B)\,dt < \infty\Bigr) = 1, \tag{47}$$

then $\mu_T^X \sim \mu_T^B$ and

$$\frac{d\mu_T^X}{d\mu_T^B}(B) = \exp\Bigl(\int_0^T A(s, B)\,dB_s - \frac12\int_0^T A^2(s, B)\,ds\Bigr), \tag{48}$$

$$\frac{d\mu_T^X}{d\mu_T^B}(X) = \exp\Bigl(\int_0^T A(s, X)\,dX_s - \frac12\int_0^T A^2(s, X)\,ds\Bigr). \tag{49}$$

Proof. The fact that the process $\bar B = (\bar B_t)_{t\le T}$, which in the case of $\mathscr{F}_t^{\bar B} = \mathscr{F}_t^X$
($t \le T$) is called an innovation process, is a Brownian motion can be easily proved
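on the basis of Itô's formula for semimartingales (Chapter III, § 3d) as applied to
$e^{i\lambda(\bar B_t - \bar B_s)}$ for $t \ge s$ and $\lambda \in \mathbb{R}$. In fact,

$$
\begin{aligned}
e^{i\lambda(\bar B_t - \bar B_s)} = 1 &+ i\lambda\int_s^t e^{i\lambda(\bar B_u - \bar B_s)}\,dB_u \\
&+ i\lambda\int_s^t e^{i\lambda(\bar B_u - \bar B_s)}\,[a_u(\omega) - A(u, X(\omega))]\,du \\
&- \frac{\lambda^2}{2}\int_s^t e^{i\lambda(\bar B_u - \bar B_s)}\,du.
\end{aligned}
\tag{50}
$$

Since

$$E\Bigl(\int_s^t e^{i\lambda(\bar B_u - \bar B_s)}\,dB_u \Bigm| \mathscr{F}_s^X\Bigr) = 0$$

and

$$
\begin{aligned}
&E\Bigl[\int_s^t e^{i\lambda(\bar B_u - \bar B_s)}\,(a_u(\omega) - A(u, X(\omega)))\,du \Bigm| \mathscr{F}_s^X\Bigr] \\
&\qquad = E\Bigl[\int_s^t e^{i\lambda(\bar B_u - \bar B_s)}\,E\bigl(a_u(\omega) - A(u, X(\omega)) \mid \mathscr{F}_u^X\bigr)\,du \Bigm| \mathscr{F}_s^X\Bigr] = 0,
\end{aligned}
$$

we see from (50) that (P-a.s.)

$$E\bigl(e^{i\lambda(\bar B_t - \bar B_s)} \mid \mathscr{F}_s^X\bigr) = 1 - \frac{\lambda^2}{2}\int_s^t E\bigl(e^{i\lambda(\bar B_u - \bar B_s)} \mid \mathscr{F}_s^X\bigr)\,du,$$

and therefore (P-a.s.)

$$E\bigl(e^{i\lambda(\bar B_t - \bar B_s)} \mid \mathscr{F}_s^X\bigr) = e^{-\frac{\lambda^2}{2}(t-s)}, \qquad s \le t, \tag{51}$$

so that $\bar B$ is a Brownian motion. (The trajectories of $(\bar B_t)_{t\le T}$ are continuous
by (43).)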
To prove the remaining assertions it suffices to observe that 


du% _ duf du? dup L] 
du} du Ro dup) dupe 


and to use Theorems 3 and 4. 
8. Theorems 1–5 can be generalized (see [303; Chapter 7] for greater detail) to
multivariate processes $X$ or to the case where (in place of diffusion coefficient 1) we
allow the diffusion coefficients in (1) and (33) to depend on $X$ and $t$.

We present the following result obtained in this direction.

Let $X = (X_t)_{t\le T}$ be a diffusion-type process with

$$dX_t = a(t, X)\,dt + \beta(t, X)\,dB_t, \tag{52}$$

where $a(t, x)$ and $\beta(t, x) > 0$ are nonanticipating functionals, the stochastic integral
$\int_0^t \beta(s, X)\,dB_s$ is well defined for $t \le T$, and $\int_0^T |a(s, X)|\,ds < \infty$ (P-a.s.).

Let $\tilde a(t, x)$ be another nonanticipating functional with $\int_0^T |\tilde a(s, X)|\,ds < \infty$ (P-a.s.)
such that

$$P\Bigl(\int_0^T \Bigl(\frac{\tilde a(s, X) - a(s, X)}{\beta(s, X)}\Bigr)^2 ds < \infty\Bigr) = 1. \tag{53}$$

We set

$$Z_T = \exp\Bigl\{\int_0^T \frac{\tilde a(s, X) - a(s, X)}{\beta(s, X)}\,dB_s - \frac12\int_0^T \Bigl(\frac{\tilde a(s, X) - a(s, X)}{\beta(s, X)}\Bigr)^2 ds\Bigr\}. \tag{54}$$

If $EZ_T = 1$ and $\tilde P_T$ is the measure such that $d\tilde P_T = Z_T\,dP$, then $X$ is a diffusion-type
process with respect to $\tilde P_T$ and its differential is

$$dX_t = \tilde a(t, X)\,dt + \beta(t, X)\,d\tilde B_t, \tag{55}$$

where $\tilde B$ is a Brownian motion (with respect to $\tilde P_T$).

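EXAMPLE. Let $X_t = e^{mt + \sigma B_t}$. Then

$$dX_t = X_t\Bigl[\Bigl(m + \frac{\sigma^2}{2}\Bigr)dt + \sigma\,dB_t\Bigr], \tag{56}$$

i.e., $a(t, X) = \bigl(m + \frac{\sigma^2}{2}\bigr)X_t$ and $\beta(t, X) = \sigma X_t$. We set $\tilde a(t, X) = 0$. By (54),

$$Z_T = \exp\Bigl\{-\Bigl(\frac{m}{\sigma} + \frac{\sigma}{2}\Bigr)B_T - \frac12\Bigl(\frac{m}{\sigma} + \frac{\sigma}{2}\Bigr)^2 T\Bigr\}, \tag{57}$$

and if a measure $\tilde P_T$ has the differential $d\tilde P_T = Z_T\,dP$, then the process $X = (X_t)_{t\le T}$
has the stochastic differential

$$dX_t = \sigma X_t\,d\tilde B_t$$

with respect to $\tilde P_T$. In other words, the process $X = (X_t)_{t\le T}$ is a standard geometric
Brownian motion with respect to $\tilde P_T$ (see Chapter III, § 3a):

$$X_t = \exp\Bigl\{\sigma\tilde B_t - \frac{\sigma^2}{2}t\Bigr\}. \tag{58}$$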
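
The density of this example can be checked numerically at the terminal time. The sketch below is illustrative only: it integrates against the $N(0, T)$ density of $B_T$ with a plain trapezoidal rule (the helper names, the grid, and the sample parameters are assumptions of the sketch, not part of the text), and it reads the density as $Z_T = \exp\{-\kappa B_T - \frac12\kappa^2 T\}$ with $\kappa = \frac{m}{\sigma} + \frac{\sigma}{2}$. It then evaluates $EZ_T$, $EX_T$, and $E[Z_T X_T]$; the last quantity should equal $X_0 = 1$ if $X$ is a martingale under the new measure.

```python
import math

def normal_expectation(f, variance, half_width=12.0, steps=20000):
    """Approximate E f(B) for B ~ N(0, variance) by the trapezoidal rule."""
    sd = math.sqrt(variance)
    lo = -half_width * sd
    h = 2.0 * half_width * sd / steps
    total = 0.0
    for i in range(steps + 1):
        x = lo + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        density = math.exp(-x * x / (2.0 * variance)) / math.sqrt(2.0 * math.pi * variance)
        total += weight * f(x) * density
    return total * h

def girsanov_check(m, sigma, T):
    """Return (E Z_T, E X_T, E[Z_T X_T]) for X_T = exp(m*T + sigma*B_T)."""
    kappa = m / sigma + sigma / 2.0
    Z = lambda x: math.exp(-kappa * x - 0.5 * kappa ** 2 * T)  # density at B_T = x
    X = lambda x: math.exp(m * T + sigma * x)                  # price at B_T = x
    return (normal_expectation(Z, T),
            normal_expectation(X, T),
            normal_expectation(lambda x: Z(x) * X(x), T))

e_z, e_x, e_zx = girsanov_check(m=0.1, sigma=0.3, T=2.0)
```

Under the original measure $EX_T = e^{(m + \sigma^2/2)T} \ne 1$ in general, whereas the weighted expectation $E[Z_T X_T]$ comes out as 1 up to quadrature error.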




§ 3c. Construction of Martingale Measures for Lévy Processes. 
Esscher Transformation 


1. If $X = (X_t)_{t\le T}$ is a process of diffusion type with respect to the original measure
$P_T$, with local characteristics $a(t, X)$ and $\beta(t, X)$ (see formula (52) in § 3b), then
Girsanov's theorem suggests an explicit construction of another measure $\tilde P_T$ such
that $X$ has local characteristics $\tilde a(t, X)$ and $\beta(t, X)$ with respect to $\tilde P_T$. If now
$\tilde a(t, X) = 0$, then $X$ is a local martingale with respect to $\tilde P_T$; one calls $\tilde P_T$ a locally
martingale measure for that reason.
The construction of this measure proceeds by the formula

$$d\tilde P_T = Z_T\,dP_T, \tag{1}$$

where $Z_T$ is defined by equality (54) in the preceding section.

The Esscher transformation suggests another construction of a new measure,
which is essentially based on the same idea. Namely, assume that the initial process
$X = (X_t)_{t\le T}$ has independent increments (e.g., is representable as the sum of
independent random variables).


2. Recall that we already encountered the Esscher transformation and its generalization
('the conditional Esscher transformation') in Chapter V, § 2d (see, in particular,
the remark in subsection 2). It should also be noted that, as a method
of constructing 'risk-neutral' probability measures assigning 'larger weights
to adverse events' and 'smaller weights to beneficial events', the Esscher transformation
has been known in actuarial practice since 1932, when F. Esscher's paper [144]
was published.

For instance, insurance companies base their calculations of life-insurance
premiums not on the (quite precisely known) distribution $P_T$ of life expectancy
('mortality tables of the second kind'), but on another, different, distribution $\tilde P_T$
('mortality tables of the first kind') that has the above property of shifting the
balance between beneficial and adverse events.

3. Before a discussion of the Esscher transformation in the general case we consider
the following simple example (cf. Chapter V, § 2d), which is a good illustration of
the true meaning of this transformation.

Let $X$ be a real-valued random variable with Laplace transform $\Phi(\lambda) = Ee^{\lambda X} < \infty$,
$\lambda \in \mathbb{R}$, and let $P = P(dx)$ be its probability distribution on $(\mathbb{R}, \mathscr{B}(\mathbb{R}))$.

We consider a family of probability measures $P^{(a)}$, $a \in \mathbb{R}$, defined by means of
the Esscher transformation:

$$P^{(a)}(dx) = \frac{e^{ax}}{\Phi(a)}\,P(dx). \tag{2}$$

Setting

$$Z^{(a)}(x) = \frac{e^{ax}}{\Phi(a)}, \tag{3}$$

we see that $Z^{(a)}(x) > 0$, $EZ^{(a)}(X) = 1$, the measure $P^{(a)}$ is equivalent to $P$, and

$$P^{(a)}(dx) = Z^{(a)}(x)\,P(dx). \tag{4}$$

It is also clear that

$$\Phi^{(a)}(\lambda) \equiv E_{P^{(a)}} e^{\lambda X} = \frac{Ee^{(\lambda + a)X}}{\Phi(a)} = \frac{\Phi(a + \lambda)}{\Phi(a)}, \tag{5}$$

and therefore

$$E_{P^{(a)}} X = \frac{d\Phi^{(a)}(\lambda)}{d\lambda}\Bigr|_{\lambda=0} = \frac{\Phi'(a)}{\Phi(a)}. \tag{6}$$

We showed in Chapter V, § 2d that if a random variable $X$ has the properties
$P(X > 0) > 0$ and $P(X < 0) > 0$, then the (strictly convex) function $\Phi(a)$ attains its minimum at
some point $\tilde a$, where, obviously, $\Phi'(\tilde a) = 0$.
Hence we obtain the expectation $\tilde EX \equiv E_{P^{(\tilde a)}} X = 0$ with respect to the measure
$\tilde P = P^{(\tilde a)}$, which one sometimes expresses by saying that $\tilde P$ is a 'risk-neutral'
probability measure.

We can consider the property $\tilde EX = 0$ also as a 'single-step' version of the
martingale property, which explains why one also calls $\tilde P$ a martingale measure.
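
For a random variable with finitely many values the Esscher tilt (2) and the risk-neutral parameter $\tilde a$ can be computed directly. The following sketch is a hedged illustration: the discrete law, the function names, and the bisection bracket $[-50, 50]$ are assumptions made here. It uses the fact that $a \mapsto E_{P^{(a)}}X = \Phi'(a)/\Phi(a)$ is nondecreasing, so that a root can be located by bisection.

```python
import math

def laplace(values, probs, a):
    """Phi(a) = E exp(a X) for a discrete law."""
    return sum(p * math.exp(a * x) for x, p in zip(values, probs))

def esscher(values, probs, a):
    """Probabilities of the tilted law P^(a) from formula (2)."""
    phi = laplace(values, probs, a)
    return [p * math.exp(a * x) / phi for x, p in zip(values, probs)]

def mean(values, probs):
    return sum(p * x for x, p in zip(values, probs))

def risk_neutral_parameter(values, probs, lo=-50.0, hi=50.0, tol=1e-12):
    """Bisection for the root of a -> E_{P^(a)} X (assumed to change sign on [lo, hi])."""
    tilted_mean = lambda a: mean(values, esscher(values, probs, a))
    if tilted_mean(lo) > 0 or tilted_mean(hi) < 0:
        raise ValueError("no sign change on the bracket")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if tilted_mean(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

values, probs = [-1.0, 2.0], [0.5, 0.5]
a_tilde = risk_neutral_parameter(values, probs)
q = esscher(values, probs, a_tilde)   # tilted probabilities under which E X = 0
```

In this two-point example the equation $\Phi'(a) = 0$ reads $-e^{-a} + 2e^{2a} = 0$, so $\tilde a = -\frac13\ln 2$; the tilt moves mass from the favourable outcome 2 to the adverse outcome $-1$.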


4. Now let $X = (X_t)_{t\le T}$ be a Lévy process on $(\Omega, \mathscr{F}_T, P_T)$ with characteristic
function (see (27) in § 3a here and Chapter III, § 1b)

$$Ee^{i\theta X_t} = e^{t\psi(\theta)}, \tag{7}$$

where the cumulant is

$$\psi(\theta) = i\theta b - \frac{\theta^2}{2}c + \int (e^{i\theta x} - 1 - i\theta g(x))\,\nu(dx) \tag{8}$$

and $g(x)$ is a truncation function (e.g., $g(x) = xI(|x| \le 1)$).
From (7) and (8), setting formally $\theta = -i\lambda$, we see that

$$Ee^{\lambda X_t} = e^{t\varphi(\lambda)}, \tag{9}$$

where

$$\varphi(\lambda) = \lambda b + \frac{\lambda^2}{2}c + \int (e^{\lambda x} - 1 - \lambda g(x))\,\nu(dx). \tag{10}$$

The easiest way to a rigorous proof of the representation (9)–(10) for the Laplace
transform is based on the following observation: the process $Z^\lambda = (Z_t^\lambda)_{t\le T}$ with

$$Z_t^\lambda = \exp\{\lambda X_t - t\varphi(\lambda)\} \tag{11}$$

is a martingale. (This is an immediate consequence of Itô's formula for semimartingales.
Of course, one must also assume that the integral in (10) is finite.)
By analogy with (2) we introduce now for each $a \in \mathbb{R}$ the probability measure
$P_T^{(a)}$ defined by means of the Esscher transformation:

$$dP_T^{(a)} = Z_T^a\,dP_T. \tag{12}$$


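THEOREM 1. The process $X = (X_t)_{t\le T}$ is also a Lévy process with respect to $P_T^{(a)}$,
$a \in \mathbb{R}$, which has the Laplace transform

$$E_{P_T^{(a)}} e^{\lambda X_t} = e^{t\varphi^{(a)}(\lambda)}, \tag{13}$$

where

$$\varphi^{(a)}(\lambda) = \varphi(a + \lambda) - \varphi(a). \tag{14}$$

Proof. This is a consequence of Bayes's formula (Chapter V, § 3d), which states
that ($P_T^{(a)}$-a.s.)

$$
\begin{aligned}
E_{P_T^{(a)}}\bigl(e^{\lambda(X_t - X_s)} \mid \mathscr{F}_s\bigr) &= E\Bigl(e^{\lambda(X_t - X_s)}\,\frac{Z_t^a}{Z_s^a} \Bigm| \mathscr{F}_s\Bigr) \\
&= E\bigl(e^{(a + \lambda)(X_t - X_s) - \varphi(a)(t - s)} \mid \mathscr{F}_s\bigr) \\
&= Ee^{(a + \lambda)(X_t - X_s) - \varphi(a)(t - s)} \\
&= e^{(\varphi(a + \lambda) - \varphi(a))(t - s)}.
\end{aligned}
$$

Hence if the measure $P_T^{(a)}$ is defined by the Esscher transformation (12), then
$X = (X_t)_{t\le T}$ is a Lévy process also with respect to this measure and its Laplace
transform is defined by (13) and (14). (Cf. Girsanov's theorem in § 3b.)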


THEOREM 2. The local characteristics $(b^{(a)}, c^{(a)}, \nu^{(a)})$ of the process $X = (X_t)_{t\le T}$
with respect to the measure $P_T^{(a)}$, $a \in \mathbb{R}$, can be determined from the local
characteristics $(b, c, \nu)$ by the following formulas (where $g(x)$ is a truncation function):

$$b^{(a)} = b + ac + \int_{\mathbb{R}} g(x)(e^{ax} - 1)\,\nu(dx), \tag{15}$$

$$c^{(a)} = c, \tag{16}$$

$$\nu^{(a)}(dx) = e^{ax}\,\nu(dx). \tag{17}$$

Proof. By Theorem 1 the process $X$ is a Lévy process with respect to $P_T^{(a)}$ and

$$\varphi^{(a)}(\lambda) = \lambda b^{(a)} + \frac{\lambda^2}{2}c^{(a)} + \int_{\mathbb{R}} (e^{\lambda x} - 1 - \lambda g(x))\,\nu^{(a)}(dx). \tag{18}$$

Bearing in mind that $\varphi^{(a)}(\lambda) = \varphi(a + \lambda) - \varphi(a)$, $\lambda \in \mathbb{R}$, we immediately obtain
'transition formulas' (15)–(17) by (18) and (10).
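
The transition formulas can be verified numerically for a Lévy measure with finitely many atoms. The sketch below is illustrative only; the discrete measure, the truncation function $g(x) = xI(|x| \le 1)$, and the function names are assumptions of the illustration. It checks that the exponent (18), evaluated at the transformed characteristics (15)–(17), coincides with $\varphi(a + \lambda) - \varphi(a)$.

```python
import math

def g(x):
    """Truncation function g(x) = x * I(|x| <= 1)."""
    return x if abs(x) <= 1.0 else 0.0

def phi(lam, b, c, nu):
    """Laplace exponent (10); nu is a list of (atom, mass) pairs."""
    integral = sum(w * (math.exp(lam * x) - 1.0 - lam * g(x)) for x, w in nu)
    return lam * b + 0.5 * lam ** 2 * c + integral

def esscher_triplet(a, b, c, nu):
    """Local characteristics under P^(a), following (15)-(17)."""
    b_a = b + a * c + sum(w * g(x) * (math.exp(a * x) - 1.0) for x, w in nu)
    nu_a = [(x, w * math.exp(a * x)) for x, w in nu]
    return b_a, c, nu_a

b, c, nu = 0.2, 0.09, [(-0.5, 1.5), (2.0, 0.4)]
a = -0.7
b_a, c_a, nu_a = esscher_triplet(a, b, c, nu)
# Difference between (18) with the new triplet and phi(a + lam) - phi(a) at lam = 0.3
gap = phi(0.3, b_a, c_a, nu_a) - (phi(a + 0.3, b, c, nu) - phi(a, b, c, nu))
```

The diffusion coefficient is unchanged, the jump intensities are reweighted by $e^{ax}$, and the drift picks up the terms $ac$ and $\int g(x)(e^{ax} - 1)\,\nu(dx)$.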


5. Let $X = (X_t)_{t\le T}$ be a Lévy process (with respect to the measure $P_T$). Since it
is a semimartingale, this process has a (non-unique, in general) decomposition
$X = M + A$, where $M$ is a local martingale and $A$ is a process of bounded
variation. We consider now the following question: what conditions on the local
characteristics $(b, c, \nu)$ make $X$ a local martingale (with respect to $P_T$)?

We can express this otherwise: we are interested in the conditions making $P_T$ a
martingale measure for the process $X$.

First we must point out that if $X$ is a local martingale, then it is a special
semimartingale, and therefore, necessarily,

$$(x^2 \wedge |x|) * \nu \in \mathscr{A}_{\mathrm{loc}} \tag{19}$$

(see (20) in § 3a).
Let $X$ be a special semimartingale, let

$$X = N + A \tag{20}$$

be its canonical decomposition (where $N$ is a local martingale and $A$ is a predictable
process of bounded variation), and let

$$X = B + X^c + g * (\mu - \nu) + (x - g) * \mu \tag{21}$$

be its canonical representation, which (in view of (19)) can be rewritten as follows:

$$X = B + X^c + g * (\mu - \nu) + (x - g) * (\mu - \nu) + (x - g) * \nu. \tag{22}$$

Comparing (20) and (22) we see that

$$A = B + (x - g) * \nu, \tag{23}$$

and therefore we can say that a special semimartingale $X$ with triplet of predictable
characteristics $(B, C, \nu)$ is a local martingale if and only if

$$B + (x - g) * \nu = 0. \tag{24}$$
All this shows that a Lévy process $X$ with triplet of local characteristics $(b, c, \nu)$ is
a special semimartingale if and only if

$$\int (x^2 \wedge |x|)\,\nu(dx) < \infty \tag{25}$$

(see (20) in § 3a), and it is, in addition, a local martingale if and only if

$$b + \int (x - g(x))\,\nu(dx) = 0. \tag{26}$$

If $g(x) = xI(|x| \le 1)$, then condition (26) assumes the following form:

$$b + \int_{|x|>1} x\,\nu(dx) = 0. \tag{27}$$

Of course, it may happen that (26) breaks down with respect to the initial
measure $P_T$. Then one could consider the measures $P_T^{(a)}$ constructed by means of
the Esscher transformation. Since the 'new' local characteristics $(b^{(a)}, c^{(a)}, \nu^{(a)})$ can
be defined by (15)–(17), it follows by (25) and (27) that a Lévy process is a local
martingale with respect to the measure $P_T^{(a)}$ if and only if

$$\int (x^2 \wedge |x|)\,e^{ax}\,\nu(dx) < \infty \tag{28}$$

and

$$b + ac + \int_{|x|>1} x\,\nu(dx) + \int_{\mathbb{R}} x(e^{ax} - 1)\,\nu(dx) = 0. \tag{29}$$

EXAMPLE 1. Let $X_t = mt + \sigma B_t + kN_t$, where $B$ is a standard Brownian motion
and $N$ is a standard Poisson process with parameter $\nu > 0$ ($EN_t = \nu t$). We now
find a value of $a$ such that $X = (X_t)_{t\le T}$ is a local martingale with respect to the
measure $P_T^{(a)}$.
We represent $X_t$ as follows:

$$X_t = (m + k\nu)t + \sigma B_t + k(N_t - \nu t), \tag{30}$$

and see that it is certainly a martingale with respect to the original measure,
provided that

$$m + k\nu = 0. \tag{31}$$

Assume now that $m + k\nu \ne 0$. Then the process $(kN_t)_{t\ge0}$ makes jumps of
amplitude $k$, the Lévy measure is $\nu(dx) = \nu \cdot I_{\{k\}}(dx)$,

$$Ee^{\lambda(kN_t)} = e^{\nu t(e^{\lambda k} - 1)}, \tag{32}$$

and the local characteristics $b$ and $c$ of $X$ (for the truncation function $g(x) = xI(|x| \le k)$)
have the following form:

$$b = m + k\nu, \qquad c = \sigma^2.$$

By (29) we obtain the following equation for $a$ (in view of our choice of the
truncation function $g(x) = xI(|x| \le k)$, integration over the set $\{x : |x| > 1\}$ is
replaced by integration over $\{x : |x| > k\}$):

$$(m + k\nu) + a\sigma^2 + \nu k(e^{ak} - 1) = 0. \tag{33}$$

If $\sigma^2 \ne 0$, then this equation has a root $\tilde a$, and therefore $X$ becomes a martingale
with respect to the measure $P_T^{(\tilde a)}$. On the other hand, if $\sigma^2 = 0$, then the root $\tilde a$
can be found from the equation

$$e^{\tilde a k} = -\frac{m}{k\nu}, \tag{34}$$

which is soluble if $m$ is not zero and of distinct sign from $k$.
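
Equation (33) is transcendental when $\sigma^2 \ne 0$, but its left-hand side is increasing in $a$ (its derivative is $\sigma^2 + \nu k^2 e^{ak} > 0$), so a root can be found by bisection. The following is a minimal sketch under that observation; the function names, the bracket $[-50, 50]$, and the sample parameters are assumptions of the illustration, and for large $|ak|$ the exponential may overflow in floating point.

```python
import math

def equation_33(a, m, sigma, k, nu):
    """Left-hand side of (33) for X_t = m*t + sigma*B_t + k*N_t."""
    return (m + k * nu) + a * sigma ** 2 + nu * k * (math.exp(a * k) - 1.0)

def solve_local_martingale_parameter(m, sigma, k, nu, lo=-50.0, hi=50.0, tol=1e-12):
    """Bisection for the root of (33); relies on the left-hand side being increasing in a."""
    f = lambda a: equation_33(a, m, sigma, k, nu)
    if f(lo) > 0 or f(hi) < 0:
        raise ValueError("no root in the bracket (e.g. sigma = 0 and m, k of the same sign)")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Pure jump case sigma = 0: the root should agree with the closed form (34).
a_jump = solve_local_martingale_parameter(m=-1.0, sigma=0.0, k=0.5, nu=3.0)
closed_form = math.log(1.0 / (0.5 * 3.0)) / 0.5
# Mixed case with a Brownian component.
a_mixed = solve_local_martingale_parameter(m=0.05, sigma=0.2, k=-0.1, nu=4.0)
```

When $\sigma = 0$ and $m$, $k$ have the same sign the function raises an error, in agreement with the solubility condition stated for (34).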


6. We assume now that the price process $S = (S_t)_{t\le T}$ is generated by some Lévy
process $X = (X_t)_{t\le T}$:

$$S_t = e^{X_t}. \tag{35}$$

Below we consider the following question, which is important for the problem
of the absence of arbitrage: is the process $S$ a martingale with respect to the
initial measure $P_T$ or some measure $P_T^{(a)}$ constructed by means of the Esscher
transformation?

As in subsection 3, we start from a simple example.

Let $X$ be a real-valued random variable and let $\Phi(a) = Ee^{aX}$. (We assume here
that $\Phi(a) < \infty$, $a \in \mathbb{R}$.) Clearly, if $\Phi(1) = 1$, then the random variable $S = e^X$ has
the 'martingale' property $ES = 1$ (with respect to the original measure).

If $\Phi(1) \ne 1$, then we can look for $a$ such that this 'martingale' property $E_{P^{(a)}}S = 1$
holds with respect to the measure $P^{(a)}$ defined by the formula (2).

Since

$$E_{P^{(a)}}S = E\,\frac{e^{(a+1)X}}{\Phi(a)} = \frac{\Phi(a + 1)}{\Phi(a)}, \tag{36}$$

the value of $a$ must be a root of the equation

$$\Phi(a + 1) - \Phi(a) = 0. \tag{37}$$

For instance, if $X$ is a normally distributed random variable with parameters $m$
and $\sigma^2$, then

$$\Phi(a) = Ee^{aX} = e^{am + \frac{(a\sigma)^2}{2}}, \tag{38}$$

and we find from (37) that

$$a = -\frac12 - \frac{m}{\sigma^2}. \tag{39}$$


We consider now the process $S = (S_t)_{t\le T}$ defined in (35). The first question
to ask here is about the conditions ensuring that this process is a martingale with
respect to the original measure $P_T$.

THEOREM 3. In order that the process $S = e^X$ be a martingale with respect to $P_T$
it is sufficient (and also necessary) that

$$\int_{|x|>1} e^x\,\nu(dx) < \infty \tag{40}$$

and

$$b + \frac12 c + (e^x - 1 - g(x)) * \nu = 0. \tag{41}$$

Proof. Condition (40) together with the inequality $(x^2 \wedge 1) * \nu < \infty$ ensures that
the integral $(e^x - 1 - g(x)) * \nu$ is finite.
Clearly,

$$E(e^{X_t - X_s} \mid \mathscr{F}_s) = Ee^{X_t - X_s} = e^{(t-s)\varphi(1)},$$

where $\varphi(1)$, defined as in (10), is precisely the left-hand side of (41). Hence

$$E(e^{X_t} \mid \mathscr{F}_s) = e^{X_s},$$

which proves that $S = e^X$ is a martingale. (The necessity of (40) and (41) is shown
in [250; Chapter X, § 2a].)


7. Assume that (41) fails. Then by Theorem 1,

$$E_{P_T^{(a)}}(e^{X_t - X_s} \mid \mathscr{F}_s) = e^{(t-s)(\varphi(a+1) - \varphi(a))}. \tag{42}$$

Hence it is clear that if $\tilde a$ is a root of the equation

$$\varphi(a + 1) - \varphi(a) = 0, \tag{43}$$

then the process $S = (S_t)_{t\le T}$ is a martingale with respect to the measure $P_T^{(\tilde a)}$.
By (43), in view of (18), we obtain the following result.

THEOREM 4. Assume that $\tilde a$ satisfies the inequality

$$|e^{\tilde a x}(e^x - 1) - g(x)| * \nu < \infty$$

and

$$b + \Bigl(\tilde a + \frac12\Bigr)c + (e^{\tilde a x}(e^x - 1) - g(x)) * \nu = 0. \tag{44}$$

Then the process $S = (S_t)_{t\le T}$ is a martingale with respect to the measure $P_T^{(\tilde a)}$.


EXAMPLE 2. Let $S_t = e^{X_t}$, where $X = (X_t)_{t\ge0}$ is the process introduced in Example 1.
Then equation (44) takes the following form:

$$\Bigl(m + \frac{\sigma^2}{2}\Bigr) + \tilde a\sigma^2 + \nu\bigl[e^{\tilde a k}(e^k - 1)\bigr] = 0. \tag{45}$$

If $\sigma = 0$ and $k \ne 0$, then (45) becomes the equation

$$e^{\tilde a k} = -\frac{m}{\nu(e^k - 1)}, \tag{46}$$

which certainly has a solution if $m \ne 0$ and $k$ and $m$ have distinct signs.
If $k = 0$, then

$$S_t = e^{mt + \sigma B_t} \tag{47}$$

is a geometric Brownian motion (Chapter III, § 4b). By Itô's formula

$$dS_t = S_t\Bigl(\Bigl(m + \frac{\sigma^2}{2}\Bigr)dt + \sigma\,dB_t\Bigr). \tag{48}$$

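From (45) or (39) we find that $\tilde a = -\frac12 - \frac{m}{\sigma^2}$ and, by (14),

$$\varphi^{(\tilde a)}(\lambda) = \frac{\sigma^2}{2}(\lambda^2 - \lambda).$$

Since

$$E_{P_T^{(\tilde a)}} e^{\lambda X_t} = \exp\Bigl\{\frac{\sigma^2 t}{2}(\lambda^2 - \lambda)\Bigr\},$$

it follows that

$$\mathrm{Law}(X_t \mid P_T^{(\tilde a)}) = \mathcal{N}\Bigl(-\frac{\sigma^2 t}{2},\ \sigma^2 t\Bigr).$$

This means that the process $(X_t)_{t\le T}$ has the same distribution with respect to
the measure $P_T^{(\tilde a)}$ as $\bigl(\sigma W_t - \frac{\sigma^2}{2}t\bigr)_{t\le T}$, where $W = (W_t)_{t\le T}$ is a standard Wiener
process (cf. the example at the end of § 3b), so that the process $S = (S_t)_{t\le T}$ becomes
a standard geometric Brownian motion.

Hence both constructions of a martingale measure, based on the Girsanov
transformation or on the Esscher transformation, bring us to the same result. (This is
hardly surprising, though, for the martingale measure is unique in this case and $X_t = mt + \sigma B_t$
is simultaneously a diffusion process and a process with independent increments.)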
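
The parameter $\tilde a$ of Example 2 can also be found numerically for general $m$, $\sigma$, $k$, $\nu$. The sketch below is an illustration under stated assumptions: the names, the bisection bracket, and the sample parameters are hypothetical. It relies on the fact that the left-hand side of (45) is nondecreasing in $\tilde a$, since its derivative is $\sigma^2 + \nu k e^{\tilde a k}(e^k - 1) \ge 0$, and it checks (45) against the definition $\varphi(a + 1) - \varphi(a)$ with $\varphi(\lambda) = \lambda m + \frac12\lambda^2\sigma^2 + \nu(e^{\lambda k} - 1)$.

```python
import math

def phi(lam, m, sigma, k, nu):
    """Laplace exponent of X_t = m*t + sigma*B_t + k*N_t (N Poisson with rate nu)."""
    return lam * m + 0.5 * (lam * sigma) ** 2 + nu * (math.exp(lam * k) - 1.0)

def equation_45(a, m, sigma, k, nu):
    """Left-hand side of (45), i.e. phi(a + 1) - phi(a) in closed form."""
    return (m + 0.5 * sigma ** 2) + a * sigma ** 2 + nu * math.exp(a * k) * (math.exp(k) - 1.0)

def exp_martingale_parameter(m, sigma, k, nu, lo=-50.0, hi=50.0, tol=1e-12):
    """Bisection for the root of (45) on an assumed bracket [lo, hi]."""
    f = lambda a: equation_45(a, m, sigma, k, nu)
    if f(lo) > 0 or f(hi) < 0:
        raise ValueError("no root in the bracket")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Brownian case k = 0: the root should reproduce (39).
a_gauss = exp_martingale_parameter(m=0.1, sigma=0.3, k=0.0, nu=0.0)
# Pure jump case sigma = 0: the root should reproduce (46).
a_jump = exp_martingale_parameter(m=-1.0, sigma=0.0, k=1.0, nu=2.0)
```

In the Brownian case the computed root agrees with $-\frac12 - \frac{m}{\sigma^2}$, which is the same change of drift as the one produced by Girsanov's theorem in the example at the end of § 3b.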


Remark. As regards the Esscher transformation and its applications to option pric- 
ing, see H. U. Gerber and E. S. W. Shiu [178], [179]. 
§ 3d. Predictable Criteria of the Martingale Property of Prices. I


1. We devote this and the next section to predictable criteria (i.e., criteria in terms
of the triplets of predictable characteristics of the process in question) ensuring that
the prices, the semimartingales $S = (S_t)_{t\ge0}$, are martingales or local martingales
(with respect to the initial measure $P$ or some measure $\tilde P \ll P$; cf. Chapter V, § 3f
for discrete time).

We start with the observation that different representations of prices can be
convenient in different problems.

If $S = (S_t, \mathscr{F}_t)_{t\ge0}$ is a semimartingale, then, by definition,

$$S_t = S_0 + a_t + m_t, \tag{1}$$

where $a = (a_t, \mathscr{F}_t)_{t\ge0}$ is a process of bounded variation and $m = (m_t, \mathscr{F}_t)_{t\ge0}$ is
a local martingale. This decomposition is not uniquely defined. For instance, if
$S = (S_t)_{t\ge0}$ is a standard Poisson process ($S_0 = 0$ and $ES_t = \lambda t$), then we can set
either $a_t = S_t$, $m_t = 0$ or $a_t = \lambda t$, $m_t = S_t - \lambda t$ in (1).

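Expansion (1) is ‘additive’. However, if $S = (S_t)_{t \ge 0}$ is a special positive
semimartingale and (1) is its decomposition with positive process $a = (a_t)_{t \ge 0}$, then,
provided that $S_{t-} + \Delta a_t \ne 0$, $S$ has the multiplicative decomposition

$$S_t = S_0\, \mathcal{E}(\widehat{a})_t\, \mathcal{E}(\widehat{m})_t, \tag{2}$$

where

$$\mathcal{E}(g)_t = e^{g_t - \frac{1}{2}\langle g^c \rangle_t} \prod_{0 < s \le t} (1 + \Delta g_s)\, e^{-\Delta g_s} \tag{3}$$

is the stochastic exponential and the processes $\widehat{a} = (\widehat{a}_t)$ and $\widehat{m} = (\widehat{m}_t)$ can be
defined by the formulas

$$\widehat{a}_t = \int_0^t \frac{da_u}{S_{u-}} \quad \text{and} \quad \widehat{m}_t = \int_0^t \frac{dm_u}{S_{u-} + \Delta a_u}. \tag{4}$$

Formula (2) can be easily established on the basis of Itô’s formula; see the details
in [304; Chapter 2, § 5].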

The product of two stochastic exponentials in (2) can be, by Yor’s formula
(see (18) in Chapter III, §3f), rewritten as a single exponential:

$$\mathcal{E}(\widehat{a})_t\, \mathcal{E}(\widehat{m})_t = \mathcal{E}(\widehat{H})_t, \tag{5}$$

where

$$\widehat{H}_t = \widehat{a}_t + \widehat{m}_t + [\widehat{a}, \widehat{m}]_t \tag{6}$$

and

$$[\widehat{a}, \widehat{m}]_t = \sum_{0 < s \le t} \Delta \widehat{a}_s\, \Delta \widehat{m}_s. \tag{7}$$

Hence we obtain the following representation for $(S_t)_{t \ge 0}$:

$$S_t = S_0\, \mathcal{E}(\widehat{H})_t, \tag{8}$$

which is very useful in the analysis of this process ‘for the martingale property’,
because the stochastic exponential $\mathcal{E}(\widehat{H})$ is a local martingale if and only if $\widehat{H}$ is a
local martingale.
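For processes that change only by finitely many jumps, the stochastic exponential (3) reduces to a finite product of the factors $1 + \Delta g_s$, and Yor's formula (5)-(7) can be checked by direct arithmetic. The following is a minimal sketch under that assumption; the jump sizes and the helper name `stochastic_exponential` are purely illustrative and do not come from the text.

```python
from math import prod, isclose

def stochastic_exponential(jumps):
    """E(X)_t for a piecewise-constant path X with X_0 = 0.

    For such a path X_t equals the sum of its jumps and there is no
    continuous martingale part, so formula (3) collapses to prod(1 + dX).
    """
    return prod(1.0 + dx for dx in jumps)

# Hypothetical jumps of a-hat and m-hat at common time points (0.0 = no jump).
a_hat_jumps = [0.10, 0.00, -0.05, 0.20]
m_hat_jumps = [0.03, -0.02, 0.00, 0.07]

# Formulas (6)-(7): each jump of H-hat is da + dm + da * dm.
h_hat_jumps = [da + dm + da * dm for da, dm in zip(a_hat_jumps, m_hat_jumps)]

lhs = stochastic_exponential(a_hat_jumps) * stochastic_exponential(m_hat_jumps)
rhs = stochastic_exponential(h_hat_jumps)
print(lhs, rhs)
```

The two printed numbers agree up to rounding, because $(1 + \Delta\widehat{a})(1 + \Delta\widehat{m}) = 1 + \Delta\widehat{a} + \Delta\widehat{m} + \Delta\widehat{a}\,\Delta\widehat{m}$ at every jump time.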

We have already mentioned (Chapter II, § 1a) that, from the standpoint of
statistical analysis, in place of (8) one could more conveniently use a representation
of ‘compound interest’ type

$$S_t = S_0\, e^{H_t} \tag{9}$$

with some semimartingale $H = (H_t)_{t \ge 0}$. The representation (9) is usually chosen
in financial mathematics as the starting point, while the transition from (9) to (8)
proceeds by the formula

$$\widehat{H}_t = H_t + \frac{1}{2}\langle H^c \rangle_t + \sum_{0 < s \le t} \bigl(e^{\Delta H_s} - 1 - \Delta H_s\bigr), \tag{10}$$

which can be written in the ‘difference-differential’ form as follows:

$$d\widehat{H}_t = dH_t + \frac{1}{2}\, d\langle H^c \rangle_t + \bigl(e^{\Delta H_t} - 1 - \Delta H_t\bigr). \tag{11}$$

To prove (10) we observe that, by Itô’s formula for $f(H) = e^H$, we obtain

$$dS_t = S_{t-} \Bigl[ dH_t + \frac{1}{2}\, d\langle H^c \rangle_t + \bigl(e^{\Delta H_t} - 1 - \Delta H_t\bigr) \Bigr]. \tag{12}$$

On the other hand, by (8) and the properties of the stochastic exponential,

$$dS_t = S_{t-}\, d\widehat{H}_t. \tag{13}$$

Comparing (12) and (13) we arrive at formula (10).
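The passage from (9) to (8) can be illustrated on the simplest kind of path: a linear drift plus finitely many jumps, with no continuous martingale part. This is a sketch under exactly that simplifying assumption; the drift `b`, the horizon `t`, and the jump sizes are hypothetical values chosen for the illustration.

```python
from math import exp, isclose

def h_hat(b, t, jumps):
    """Formula (10) for H_t = b*t + sum(jumps), where <H^c> = 0."""
    h_t = b * t + sum(jumps)
    return h_t + sum(exp(dh) - 1.0 - dh for dh in jumps)

def stoch_exp_no_continuous_part(x_t, x_jumps):
    """Formula (3) when X has no continuous martingale part:
    E(X)_t = exp(X_t) * prod((1 + dX) * exp(-dX))."""
    value = exp(x_t)
    for dx in x_jumps:
        value *= (1.0 + dx) * exp(-dx)
    return value

b, t = 0.05, 2.0
jumps = [0.3, -0.7, 0.1]            # hypothetical jumps of H on (0, t]

H_t = b * t + sum(jumps)
H_hat_t = h_hat(b, t, jumps)
# By (11), each jump of H-hat equals exp(dH) - 1.
H_hat_jumps = [exp(dh) - 1.0 for dh in jumps]

lhs = exp(H_t)                                              # S_t / S_0 from (9)
rhs = stoch_exp_no_continuous_part(H_hat_t, H_hat_jumps)    # S_t / S_0 from (8)
print(lhs, rhs)
```

Both values coincide up to rounding, which is what (8)-(10) assert for this path: $e^{H_t} = \mathcal{E}(\widehat{H})_t$.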


Remark 1. The infinite sum in (10) is absolutely convergent (P-a.s.) since for each
semimartingale $H$ there exist (P-a.s.) only finitely many instants $s \le t$ such that

$$|\Delta H_s| > \frac{1}{2} \quad \text{and} \quad \sum_{0 < s \le t} (\Delta H_s)^2 < \infty \quad \text{(P-a.s.)};$$

see Chapter II, §5b. For the same reason the infinite product in the definition of
the stochastic exponential (3) is also absolutely convergent.
2. Let $H$ be a semimartingale and let

$$H = H_0 + B + H^c + g * (\mu - \nu) + (x - g(x)) * \mu \tag{14}$$

be its canonical representation (with respect to some truncation function $g = g(x)$;
here $\mu = \mu^H$ is the jump measure of $H$ and $\nu = \nu^H$ is its compensator; see § 3a).

By (10) and (14) we obtain the following representation for $\widehat{H}$:

$$\begin{aligned}
\widehat{H} &= H + \frac{1}{2}\langle H^c \rangle + (e^x - 1 - x) * \mu \\
&= H_0 + B + H^c + \frac{1}{2}\langle H^c \rangle + g * (\mu - \nu) + (x - g(x)) * \mu + (e^x - 1 - x) * \mu.
\end{aligned} \tag{15}$$

To transform the right-hand side of (15) we use the fact that $|W| * \mu \in \mathcal{A}^+_{\mathrm{loc}}$ if
$|W| * \nu \in \mathcal{A}^+_{\mathrm{loc}}$ and, moreover,

$$W * (\mu - \nu) = W * \mu - W * \nu \tag{16}$$

(see § 3a).

Hence we see from (15) that if

$$\bigl(|x|\, I(|x| \le 1) + e^x I(|x| > 1)\bigr) * \nu \in \mathcal{A}^+_{\mathrm{loc}}, \tag{17}$$

then

$$\widehat{H} = K + H_0 + H^c + (e^x - 1) * (\mu - \nu), \tag{18}$$

where $H^c + (e^x - 1) * (\mu - \nu) \in \mathcal{M}_{\mathrm{loc}}(P)$ and

$$K = B + \frac{1}{2}\langle H^c \rangle + (e^x - 1 - g(x)) * \nu.$$


Thus, the following result is a consequence of (18).


THEOREM. Assume that condition (17) is satisfied. Then $\widehat{H} \in \mathcal{M}_{\mathrm{loc}}(P)$ and
$S \in \mathcal{M}_{\mathrm{loc}}(P)$ if and only if

$$K_t = 0 \quad \text{(P-a.s.)}, \quad t > 0. \tag{19}$$

In that case the local martingale $\widehat{H}$ has the representation

$$\widehat{H} = H_0 + H^c + (e^x - 1) * (\mu - \nu). \tag{20}$$
EXAMPLE. Let $H$ be a Lévy process with triplet $(B, C, \nu)$ of the following form:

$$B_t = bt, \quad C_t = \sigma^2 t, \quad \nu(dt, dx) = dt\, F(dx), \tag{21}$$

where $F = F(dx)$ is a measure such that $F(\{0\}) = 0$ and

$$\int (x^2 \wedge 1)\, F(dx) < \infty. \tag{22}$$

We can assume also a stronger version of (22):

$$\int \bigl(|x|\, I(|x| \le 1) + e^x I(|x| > 1)\bigr)\, F(dx) < \infty. \tag{23}$$

Under this assumption the price process $S_t = S_0 e^{H_t}$ is a martingale (with respect
to the initial measure $P$) if $(b, \sigma^2, F)$ satisfies the following relation:

$$b + \frac{\sigma^2}{2} + \int (e^x - 1 - g(x))\, F(dx) = 0. \tag{24}$$

If $B_t = B_0 e^{rt}$ is a bank account, then the discounted price process
$\dfrac{S}{B} = \Bigl(\dfrac{S_t}{B_t}\Bigr)_{t \ge 0}$ is a martingale with respect to $P$ if

$$b + \frac{\sigma^2}{2} + \int (e^x - 1 - g(x))\, F(dx) = r. \tag{25}$$

Remark 2. In accordance with notation (10) in the preceding section the left-hand
side of (24) and (25) is the value of the ‘cumulant’ function $\varphi(\lambda)$ for $\lambda = 1$. Hence
formula (25) can be put in the following form:

$$\varphi(1) = r. \tag{26}$$

Remark 3. As Theorem 3 in § 3c shows, the condition $\int |x|\, I(|x| \le 1)\, F(dx) < \infty$ is
redundant for Lévy processes. (This condition is a result of the ‘regrouping’ in (15)
based on formula (16).)
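Condition (25) can be made concrete when the Lévy measure $F$ is a finite sum of point masses, so that the jump part of $H$ is a compound Poisson process. The sketch below rests on that assumption and on the illustrative truncation function $g(x) = x\,I(|x| \le 1)$; the atoms, the parameters, and the function names are hypothetical. It solves (25) for $b$ and then computes $\mathsf{E}\, e^{H_t}$ separately, by summing over the number of jumps, to confirm that it equals $e^{rt}$.

```python
from math import exp, factorial, isclose

def g(x):
    """Illustrative truncation function g(x) = x * I(|x| <= 1)."""
    return x if abs(x) <= 1.0 else 0.0

def drift_for_martingale(r, sigma, atoms):
    """Solve (25) for b; atoms = [(x_i, F({x_i})), ...] is a finite Levy measure."""
    integral = sum(w * (exp(x) - 1.0 - g(x)) for x, w in atoms)
    return r - 0.5 * sigma ** 2 - integral

def mean_exp_H(b, sigma, atoms, t, n_terms=60):
    """E exp(H_t), conditioning on the Poisson number of jumps on [0, t]."""
    total = sum(w for _, w in atoms)                        # jump intensity F(R)
    mgf_jump = sum(w * exp(x) for x, w in atoms) / total    # E exp(one jump)
    # With finite F, g*(mu - nu) + (x - g)*mu = (sum of jumps) - t * int g dF.
    drift = b - sum(w * g(x) for x, w in atoms)
    series = sum(exp(-total * t) * (total * t) ** n / factorial(n) * mgf_jump ** n
                 for n in range(n_terms))
    return exp(drift * t + 0.5 * sigma ** 2 * t) * series

r, sigma, t = 0.03, 0.2, 1.5
atoms = [(-0.4, 0.5), (0.25, 1.0), (1.5, 0.1)]   # hypothetical (jump size, intensity)

b = drift_for_martingale(r, sigma, atoms)
print(b, mean_exp_H(b, sigma, atoms, t), exp(r * t))
```

With $r = 0$ the same computation illustrates (24): the expectation of $e^{H_t}$ is then equal to one for every $t$.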


§3e. Predictable Criteria of the Martingale Property of Prices. II 


1. Without assumption (19) (see the theorem in § 3d) the price process $S = S_0 e^{H}$
is not a local martingale with respect to the initial measure $P$.

However, it is sufficient in many cases (e.g., in the problem of the absence of
arbitrage; see § 2b) that there exists some measure $\widetilde{P}$ such that
$\widetilde{P} \overset{\mathrm{loc}}{\sim} P$ or $\widetilde{P} \overset{\mathrm{loc}}{\ll} P$ and
$S$ is a local martingale with respect to $\widetilde{P}$.

We have thoroughly discussed the question of the existence of such measures in
models with discrete time (Chapter V, §§ 3a-3f).

Below, we consider this question in the continuous-time case for semimartingale
models.
2. Let $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t \ge 0}, P)$ be a stochastic basis and let $P_t = P \,|\, \mathcal{F}_t$ be the restriction
of $P$ to $\mathcal{F}_t$. Assume also that $\widetilde{P}$ is a probability measure on $\mathcal{F}$ such that
$\widetilde{P} \overset{\mathrm{loc}}{\ll} P$, i.e., $\widetilde{P}_t \ll P_t$ for all $t \ge 0$. We also assume that
$\mathcal{F}_0 = \{\varnothing, \Omega\}$ and $\widetilde{P}_0 = P_0$.

Our analysis of the existence of measures $\widetilde{P}$ making one or another process a
local martingale starts with Girsanov’s theorem for local martingales, which
demonstrates what happens to local martingales undergoing an absolutely continuous
change of measure.


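THEOREM 1. Assume that $\widetilde{P} \overset{\mathrm{loc}}{\ll} P$, let $M \in \mathcal{M}_{\mathrm{loc}}(P)$ with $M_0 = 0$, and let
$Z = (Z_t)_{t \ge 0}$, where $Z_t = \dfrac{d\widetilde{P}_t}{dP_t}$. Assume that the quadratic covariance $[M, Z]$
has a $P$-locally integrable variation and let $\langle M, Z \rangle$ be the predictable quadratic
covariance (the compensator of $[M, Z]$).

Then the process

$$\widetilde{M} = M - \frac{1}{Z_-} \cdot \langle M, Z \rangle \tag{1}$$

is a local $\widetilde{P}$-martingale and the $\widetilde{P}$-characteristic $\langle \widetilde{M}^c, \widetilde{M}^c \rangle$ is the same ($\widetilde{P}$-a.s.) as
the $P$-characteristic $\langle M^c, M^c \rangle$.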
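Proof. In accordance with the lemma in Chapter V, §3d,

$$XZ \in \mathcal{M}(P) \iff X \in \mathcal{M}(\widetilde{P}). \tag{2}$$

(We have stated and proved this lemma in the discrete-time case; this can be
trivially extended to continuous time.)

From (2) we can easily derive the following local versions of this equivalence
(see [250; Chapter III, § 3b] for the details):

$$XZ \in \mathcal{M}_{\mathrm{loc}}(P) \implies X \in \mathcal{M}_{\mathrm{loc}}(\widetilde{P}); \tag{3}$$

$$(XZ)^{T_n} \in \mathcal{M}_{\mathrm{loc}}(P),\ n \ge 1 \implies X \in \mathcal{M}_{\mathrm{loc}}(\widetilde{P}), \tag{4}$$

where $(XZ)^{T_n} = (X_{t \wedge T_n} Z_{t \wedge T_n})_{t \ge 0}$ and $T_n = \inf(t\colon Z_t < 1/n)$.

Thus, to prove that $\widetilde{M} \in \mathcal{M}_{\mathrm{loc}}(\widetilde{P})$, it is sufficient to verify that
$(\widetilde{M} Z)^{T_n} \in \mathcal{M}_{\mathrm{loc}}(P)$, $n \ge 1$.

Let

$$A = \frac{1}{Z_-} \cdot \langle M, Z \rangle. \tag{5}$$

Then, by Itô’s formula,

$$\begin{aligned}
(M - A) Z &= MZ - AZ \\
&= (M_- \cdot Z + Z_- \cdot M + [M, Z]) - (A \cdot Z + Z_- \cdot A) \\
&= \bigl(M_- \cdot Z + Z_- \cdot M + ([M, Z] - \langle M, Z \rangle)\bigr) + \langle M, Z \rangle - A \cdot Z - \langle M, Z \rangle \\
&= M_- \cdot Z + Z_- \cdot M + ([M, Z] - \langle M, Z \rangle) - A \cdot Z.
\end{aligned} \tag{6}$$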
The first three terms on the right-hand side of (6) are local $P$-martingales. The
same can be said for each $n \ge 1$ about the process $(A \cdot Z)^{T_n}$. Hence, by assertion (4)
the process $\widetilde{M}$ belongs to $\mathcal{M}_{\mathrm{loc}}(\widetilde{P})$.

Thus, $M$ is a semimartingale with respect to the measure $\widetilde{P}$, and it has the
canonical decomposition

$$M = \widetilde{M} + A. \tag{7}$$

Hence by the definition of quadratic variations $[M, M]$ and $[\widetilde{M}, \widetilde{M}]$ and considering
the limits as $n \to \infty$ of the Riemann sequences $S^{(n)}(M, M)$ and $S^{(n)}(\widetilde{M}, \widetilde{M})$
(see formula (10) in Chapter III, §5b) we obtain that $[M, M] = [\widetilde{M}, \widetilde{M}]$ up to
$\widetilde{P}$-indistinguishability. Finally, in view of formula (22) in the same section, § 5b, of
Chapter III, we conclude that the predictable quadratic variations $\langle M^c, M^c \rangle$ and
$\langle \widetilde{M}^c, \widetilde{M}^c \rangle$ coincide (a.s. with respect to the measure $\widetilde{P}$).


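3. Let $S_t = S_0 e^{H_t}$, where $H = (H_t)_{t \ge 0}$ is a semimartingale, assume that
$\widetilde{P} \overset{\mathrm{loc}}{\ll} P$, and let $Z_t = \dfrac{d\widetilde{P}_t}{dP_t}$, $t \ge 0$.

Assume that the process $Z = (Z_t)_{t \ge 0}$ is generated by some $P$-local martingale
$N = (N_t)_{t \ge 0}$:

$$dZ_t = Z_{t-}\, dN_t. \tag{8}$$

That is, let $Z = \mathcal{E}(N)$.

We represent $S$ as the product $S = S_0\, \mathcal{E}(\widehat{H})$, where $\widehat{H}$ can be found from $H$ by
the formula (10) in the preceding section.

Let $\widehat{H}$ be a special semimartingale with canonical decomposition

$$\widehat{H} = H_0 + A + M, \tag{9}$$

where $M \in \mathcal{M}_{\mathrm{loc}}(P)$ and $A$ is a predictable process of locally bounded variation.
We represent $\widehat{H}$ as follows:

$$\widehat{H} = H_0 + A + M = H_0 + A + \langle M, N \rangle + (M - \langle M, N \rangle), \tag{10}$$

and observe that

$$\frac{1}{Z_-} \cdot \langle M, Z \rangle = \langle M, N \rangle. \tag{11}$$

Then, by Theorem 1, the process $M - \langle M, N \rangle$ is a local martingale with respect to
the measure $\widetilde{P}$ with $d\widetilde{P}_t = Z_t\, dP_t$, $t \ge 0$.

Hence we obtain the following result.


THEOREM 2. If $\widehat{H}$ is a special semimartingale with canonical decomposition (9),
$\widetilde{P} \overset{\mathrm{loc}}{\ll} P$, and the process $Z = (Z_t)_{t \ge 0}$ has the representation (8), then

$$A + \langle M, N \rangle = 0 \implies \widehat{H} \in \mathcal{M}_{\mathrm{loc}}(\widetilde{P}). \tag{12}$$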
4. Assume that condition (17) in the preceding section is satisfied and the
process $N$ has the following representation:

$$N = \beta \cdot H^c + (Y - 1) * (\mu - \nu), \tag{13}$$

where $\beta = (\beta_t(\omega))_{t \ge 0}$ is a predictable process and $Y = Y(t, \omega, x)$ is a $\widetilde{\mathcal{P}}$-measurable
function of $t \ge 0$, $\omega \in \Omega$, and $x \in \mathbb{R}$. (Here $\mu = \mu^H$, $\nu = \nu^H$, and we assume that
the corresponding integrals with respect to $H^c$ and $\mu - \nu$ are well defined.)

We use the representation (18) in § 3d for $\widehat{H}$:

$$\widehat{H} = H_0 + K + H^c + (e^x - 1) * (\mu - \nu). \tag{14}$$

If $\nu(\{t\} \times dx; \omega) = 0$, then we see that

$$\langle M, N \rangle = \beta \cdot \langle H^c \rangle + (Y - 1)(e^x - 1) * \nu \tag{15}$$

(see the observation at the end of § 3a.4). By (12), (14), (15), and also relation (19)
in §3d we obtain the following result.


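THEOREM 3. Assume that the conditions (17) in §3d, $\nu(\{t\} \times dx; \omega) = 0$, and

$$|Y - 1|\, |e^x - 1| * \nu < \infty \tag{16}$$

are satisfied. If, in addition,

$$B + \Bigl(\frac{1}{2} + \beta\Bigr) \cdot \langle H^c \rangle + (e^x - 1 - g(x)) * \nu + (e^x - 1)(Y - 1) * \nu = 0, \tag{17}$$

then the processes $\widehat{H}$ and $S = S_0\, \mathcal{E}(\widehat{H})$ are local martingales with respect to the
measure $\widetilde{P}$ such that $d\widetilde{P}_t = Z_t\, dP_t$, $t \ge 0$.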


EXAMPLE. Let $H$ be the Lévy process considered in the example of §3d. Let
$\beta_s(\omega) \equiv \beta$ and let $Y = Y(x)$. Assume also that

$$b + \Bigl(\frac{1}{2} + \beta\Bigr) \sigma^2 + \int (e^x - 1 - g(x))\, F(dx) + \int (e^x - 1)(Y - 1)\, F(dx) = 0. \tag{18}$$

Then the processes $\widehat{H}$ and $S = S_0\, \mathcal{E}(\widehat{H})$ are $\widetilde{P}$-local martingales.

Note that condition (18) can be also written as follows:

$$b + \Bigl(\frac{1}{2} + \beta\Bigr) \sigma^2 + \int \bigl((e^x - 1) Y - g(x)\bigr)\, F(dx) = 0. \tag{19}$$

For $\beta = 0$ and $Y = 1$ this is the same as condition (24) in §3d.
We recall the representation for the cumulant function $\varphi(\lambda)$ in § 3c:

$$\varphi(\lambda) = \lambda b + \frac{\lambda^2}{2} \sigma^2 + \int (e^{\lambda x} - 1 - \lambda g(x))\, F(dx). \tag{20}$$

Setting $\beta = \lambda$ and $Y(x) = e^{\lambda x}$, we find from (20) that (19) holds if $\lambda$ is a root
of the equation

$$\varphi(\lambda + 1) - \varphi(\lambda) = 0. \tag{21}$$

If $B_t = B_0 e^{rt}$, then the discounted prices $\dfrac{S}{B}$ form a local martingale with respect
to the measure $\widetilde{P}$ if $d\widetilde{P}_t = Z_t\, dP_t$ and $dZ_t = Z_{t-}\, dN_t$, where

$$N = \lambda \cdot H^c + (e^{\lambda x} - 1) * (\mu - \nu) \tag{22}$$

and $\lambda$ is the root of the equation

$$\varphi(\lambda + 1) - \varphi(\lambda) = r.$$
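The root $\lambda$ of $\varphi(\lambda + 1) - \varphi(\lambda) = r$ rarely has a closed form once jumps are present, but the left-hand side is increasing in $\lambda$ (the cumulant function is convex), so bisection is enough. The sketch below assumes a Lévy measure with finitely many atoms and the illustrative truncation $g(x) = x\,I(|x| \le 1)$; the search bracket, the parameters, and the function names are assumptions of this example only.

```python
from math import exp, isclose

def g(x):
    """Illustrative truncation function g(x) = x * I(|x| <= 1)."""
    return x if abs(x) <= 1.0 else 0.0

def cumulant(lam, b, sigma, atoms):
    """Formula (20) for a Levy measure F given by atoms [(x_i, F({x_i})), ...]."""
    jump_part = sum(w * (exp(lam * x) - 1.0 - lam * g(x)) for x, w in atoms)
    return lam * b + 0.5 * lam ** 2 * sigma ** 2 + jump_part

def esscher_root(r, b, sigma, atoms, lo=-20.0, hi=20.0, iterations=200):
    """Bisection for phi(lam + 1) - phi(lam) = r on the bracket [lo, hi]."""
    def f(lam):
        return cumulant(lam + 1.0, b, sigma, atoms) - cumulant(lam, b, sigma, atoms) - r
    if f(lo) > 0.0 or f(hi) < 0.0:
        raise ValueError("no sign change on the chosen bracket")
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

b, sigma, r = 0.1, 0.2, 0.03
# Without jumps the equation is linear: lam = (r - b - sigma^2/2) / sigma^2.
lam_diffusion = esscher_root(r, b, sigma, atoms=[])
# With hypothetical jumps the root has to be found numerically.
atoms = [(-0.4, 0.5), (0.25, 1.0)]
lam_jumps = esscher_root(r, b, sigma, atoms)
print(lam_diffusion, lam_jumps)
```

In the jump-free case the result can be compared with the explicit value $(r - b - \sigma^2/2)/\sigma^2$, which for the parameters above equals $-2.25$.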


§ 3f. Representability of Local Martingales
(‘$(H^c, \mu - \nu)$-Representability’)


1. We assumed in the previous section that the density process $Z = (Z_t)_{t \ge 0}$ with
$Z_t = \dfrac{d\widetilde{P}_t}{dP_t}$ (which is a $P$-local martingale) has a representation $Z = \mathcal{E}(N)$, where
the $P$-local martingale $N$ is a sum of two integrals, with respect to $H^c$ and $\mu - \nu$
(see (13)).

Comparing this with the ‘$(\mu - \nu)$-representability’ in the discrete-time case
(Chapter V, §4c) we see that the term ‘$(H^c, \mu - \nu)$-representability’ fits very well
in this context, so we use it in the title of this section.

The issue of the representability of local martingales is considered in full gener- 
ality in [250; Chapter III, § 4c]. Hence we discuss here only several general results 
related directly to arbitrage, completeness, and the construction of probability mea- 
sures that are locally absolutely continuous with respect to the original measure. 


2. We note first of all that to answer in a satisfactory way the question on the
representation of local martingales in terms of the local martingale $H^c$ and the
martingale measure $\mu - \nu$ we must impose certain additional restrictions on the
structure of the space $\Omega$ of elementary outcomes $\omega$. Namely, we shall assume in what
follows that $\Omega$ is the canonical space of all right-continuous functions $\omega = (\omega_t)_{t \ge 0}$
that have limits from the left. (See also [250; Chapter III, 2.13] on this subject.)

We shall assume all the processes $X = (X_t(\omega))_{t \ge 0}$ below and, in particular,
semimartingales to be canonical (i.e., $X_t(\omega) = \omega_t$).
We shall take for a filtration $(\mathcal{F}_t)_{t \ge 0}$ the family of $\sigma$-algebras

$$\mathcal{F}_t = \bigcap_{s > t} \mathcal{F}_s^0,$$

where $\mathcal{F}_s^0 = \sigma(\omega\colon \omega_u,\ u \le s)$. We also set $\mathcal{F} = \bigvee_t \mathcal{F}_t$.

Let $P$ be a probability measure on $(\Omega, \mathcal{F})$, $P_t = P \,|\, \mathcal{F}_t$, $t \ge 0$, and let
$H = (H_t, \mathcal{F}_t)_{t \ge 0}$ be a semimartingale with triplet of predictable characteristics $(B, C, \nu)$.
For simplicity we assume that $H_0 = \mathrm{Const}$ (P-a.s.).

The question whether the triplet $(B, C, \nu)$ defines the measure $P$ unambiguously
is of interest in many respects. This is not the case in general, as can be seen in
the most simple, ‘deterministic’ examples.

For instance, let $H = (H_t)_{t \ge 0}$ be a solution of an (ordinary) differential equation

$$\dot{H}_t = 2 |H_t|^{1/2}, \quad H_0 = 0$$

(with non-Lipschitz right-hand side). Obviously, this equation has two solutions,
$H^{(1)}_t \equiv 0$ and $H^{(2)}_t = t^2$. They are both semimartingales with respect to the
measures $P^{(1)}$ and $P^{(2)}$, where the first is concentrated at the trajectory $\omega_t \equiv 0$ and
the second at $\omega_t = t^2$. At the same time, their corresponding triplets $(B, C, \nu)$ are
the same: $C = 0$, $\nu = 0$, and $B_t(\omega) = \int_0^t 2 |\omega_s|^{1/2}\, ds$.
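That both trajectories solve the same equation can be confirmed by a direct numerical check. The sketch below compares a central-difference derivative with the right-hand side $2|H|^{1/2}$ at a few time points; the grid and the step size are arbitrary choices made for the illustration.

```python
def rhs(h):
    """Right-hand side of the equation: 2 * |H|^(1/2)."""
    return 2.0 * abs(h) ** 0.5

def solution_zero(t):
    return 0.0

def solution_square(t):
    return t * t

def max_residual(solution, times, step=1e-6):
    """Largest gap between the numerical derivative and the right-hand side."""
    worst = 0.0
    for t in times:
        derivative = (solution(t + step) - solution(t - step)) / (2.0 * step)
        worst = max(worst, abs(derivative - rhs(solution(t))))
    return worst

times = [0.5 * k for k in range(1, 11)]   # t = 0.5, 1.0, ..., 5.0
print(max_residual(solution_zero, times), max_residual(solution_square, times))
```

Both residuals are zero up to rounding, so the functional $B_t(\omega) = \int_0^t 2|\omega_s|^{1/2}\,ds$ reproduces each trajectory when evaluated along it, while the two measures concentrated on these trajectories are different.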


3. The role played by triplets and the uniqueness of probability measure in the
problem of ‘$(H^c, \mu - \nu)$-representability’ is revealed by the following result.


THEOREM 1. Let $H = (H_t, \mathcal{F}_t)_{t \ge 0}$, $H_0 = \mathrm{Const}$, be a semimartingale with triplet
$(B, C, \nu)$ on a filtered probability space $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t \ge 0}, P)$ and assume that the
measure $P$ is unique in the following sense: if $P'$ is another measure such that $H$
has the same triplet with respect to it, $P' \overset{\mathrm{loc}}{\ll} P$, and $P'_0 = P_0$, then $P' = P$.

Then each local martingale $N = (N_t, \mathcal{F}_t)$ has a representation

$$N = N_0 + f \cdot H^c + W * (\mu - \nu), \tag{1}$$

where $f$ is a predictable process with $f^2 \cdot \langle H^c \rangle \in \mathcal{A}^+_{\mathrm{loc}}$ and $W$ is a $\widetilde{\mathcal{P}}$-predictable
process with $G(W) \in \mathcal{A}^+_{\mathrm{loc}}$ (§3a).

The proof of this result and its generalization (‘the Fundamental representation 
theorem’) can be found in [250; Chapter III, § 4d]. 

One can deduce from this theorem the following results concerning
‘$(H^c, \mu - \nu)$-representability’. (They are useful, in particular, for complete
arbitrage-free models.)




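THEOREM 2. Let $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t \ge 0}, P)$ be the canonical filtered probability space.

a) If $H = (H_t, \mathcal{F}_t)_{t \ge 0}$ is a Brownian motion, then each local martingale
$N = (N_t, \mathcal{F}_t)_{t \ge 0}$ has the following form:

$$N = N_0 + f \cdot H, \tag{2}$$

where $f^2 \cdot \langle H \rangle \in \mathcal{A}^+_{\mathrm{loc}}$.

b) If a semimartingale $H = (H_t, \mathcal{F}_t)$ has independent increments, then each
local martingale $N = (N_t, \mathcal{F}_t)_{t \ge 0}$ has a representation (1).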


The proof is an immediate consequence of Theorem 1 in view of the uniqueness
of the Wiener measure and the following fact: processes with independent
increments have deterministic triplets, which uniquely define the probability
distributions (by the Lévy-Khintchine formula).

Note that we have already encountered assertion a) (in Chapter III, § 3c). 


4. Apart from ‘classical’ cases a) and b) of Theorem 2, we shall now discuss briefly
another case of the ‘$(H^c, \mu - \nu)$-representability’ of local martingales.

We consider the stochastic differential equation

$$\begin{aligned}
dH_t = {} & b(t, H_t)\, dt + \sigma(t, H_t)\, dB_t \\
& + g(\delta(t, H_{t-}, x)) \bigl(\mu(dt, dx; \omega) - \nu(dt, dx; \omega)\bigr) + g'(\delta(t, H_{t-}, x))\, \mu(dt, dx; \omega)
\end{aligned} \tag{3}$$

(cf. Chapter III, § 3e), where $b$, $\sigma$, and $\delta$ are Borel functions, $g = g(x)$ is a truncation
function, $g'(x) = x - g(x)$, $B$ is a Brownian motion, and $\mu$ is a homogeneous Poisson
measure with compensator $\nu(dt, dx) = dt\, F(dx)$ (§3a). It is well known (see,
e.g., [250; Chapter III, § 2c]) that for (locally) Lipschitz coefficients satisfying the
condition of linear growth the stochastic differential equation (3) (with initial condition
$H_0 = \mathrm{Const}$) has a unique strong solution (Chapter III, §3e). Moreover, whatever
the initial probability space on which both Brownian motion and Poisson measure
are defined, the probability distribution of the solution process $H$ on the canonical
space $(\Omega, \mathcal{F})$ is uniquely defined.
The process $H$ is a semimartingale with triplet $(B, C, \nu)$, where

$$B_t(\omega) = \int_0^t b(s, \omega_s)\, ds,$$

$$C_t(\omega) = \int_0^t \sigma^2(s, \omega_s)\, ds,$$

$$\nu(dt, dx; \omega) = dt\, K_t(\omega_t, dx)$$

and $K_t(\omega_t, A) = \int I_{A \setminus \{0\}}(\delta(t, \omega_t, x))\, F(dx)$.

Hence, if the coefficients satisfy the above-mentioned conditions (the local
Lipschitz condition and the condition of linear growth), then each local martingale
$N = (N_t, \mathcal{F}_t)_{t \ge 0}$ admits an ‘$(H^c, \mu - \nu)$-representation’. (See [250; Chapter II,
§ 2a] for greater detail.)
§3g. Girsanov’s Theorem for Semimartingales.
Structure of the Densities of Probabilistic Measures


1. If $M = (M_t, \mathcal{F}_t)_{t \ge 0}$ is a local martingale on $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t \ge 0}, P)$ and
$\widetilde{P} \overset{\mathrm{loc}}{\ll} P$, then $M$ is a semimartingale with respect to this measure (see (7) in § 3e).

Remarkably, each semimartingale is transformed into a semimartingale again
by such a change. That is, the class of semimartingales is stable under locally
absolutely continuous changes of measure. (This is an easy consequence of Itô’s formula for
semimartingales; see Chapter III, § 5c.)

From the standpoint of arbitrage the following question is of particular interest
for financial mathematics: what can be said of the measures $\widetilde{P}$ such that
$\widetilde{P} \overset{\mathrm{loc}}{\sim} P$ or $\widetilde{P} \overset{\mathrm{loc}}{\ll} P$ and the semimartingale $X$ in question
(describing, e.g., the dynamics of prices) is a local martingale or a martingale with
respect to $\widetilde{P}$?


2. One possible approach to this problem is to describe how the canonical
representation (with respect to $P$)

$$X = X_0 + B + X^c + g * (\mu - \nu) + (x - g(x)) * \mu \tag{1}$$

of a semimartingale $X$ with triplet $(B, C, \nu)$ transforms under a locally absolutely
continuous change of measure $\widetilde{P} \overset{\mathrm{loc}}{\ll} P$ into the canonical representation

$$X = X_0 + \widetilde{B} + \widetilde{X}^c + g * (\mu - \widetilde{\nu}) + (x - g(x)) * \mu \tag{2}$$

(with respect to $\widetilde{P}$) of the same semimartingale with new triplet $(\widetilde{B}, \widetilde{C}, \widetilde{\nu})$.


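Let $Z_t = \dfrac{d\widetilde{P}_t}{dP_t}$, $t \ge 0$. We set

$$\beta = \frac{d\langle Z^c, X^c \rangle}{d\langle X^c, X^c \rangle} \cdot \frac{I(Z_- > 0)}{Z_-}, \tag{3}$$

$$Y = E^P_\mu \Bigl( \frac{Z}{Z_-}\, I(Z_- > 0) \,\Big|\, \widetilde{\mathcal{P}} \Bigr), \tag{4}$$

where $E^P_\mu$ is averaging with respect to the measure $M^P_\mu$ on $(\Omega \times \mathbb{R}_+ \times E,\
\mathcal{F} \otimes \mathcal{B}(\mathbb{R}_+) \otimes \mathcal{E})$ defined by the formula $W * M^P_\mu = \mathsf{E}(W * \mu)_\infty$ for all nonnegative
measurable functions $W = W(\omega, t, x)$. (Cf. the definition of $Y_n(x, \omega)$ and $M_n(dx, d\omega)$
in Chapter V, §3e.)

The processes $\beta$ and $Y$ are crucial in the issue of the transformations that triplets
undergo under changes of measure. The following result is often called Girsanov’s
theorem for semimartingales.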
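THEOREM 1. Assume that $\widetilde{P} \overset{\mathrm{loc}}{\ll} P$, $Z_t = \dfrac{d\widetilde{P}_t}{dP_t}$, $t \ge 0$, and let the processes $\beta$ and
$Y$ be defined in terms of $Z = (Z_t)_{t \ge 0}$ by formulas (3) and (4).

Then $\widetilde{B}$, $\widetilde{C}$, and $\widetilde{\nu}$, defined by the equalities

$$\widetilde{B} = B + \beta \cdot C + g(x)(Y - 1) * \nu, \tag{5}$$

$$\widetilde{C} = C, \tag{6}$$

$$\widetilde{\nu} = Y \cdot \nu, \tag{7}$$

make up a triplet of the semimartingale $X$ with respect to the measure $\widetilde{P}$.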


The proof of this result (not exclusively in the above case of one-dimensional 
semimartingales, but also for several dimensions) is presented in [250; Chapter III, 
§3d] and is fairly complicated technically. Referring to this monograph for detail 
we comment now on the meaning of this theorem. 

Note first of all that the corresponding result in the discrete-time case was proved
in Chapter V, §3e, where we explained the meaning of the discrete (relative to time)
analogs of the measure $M^P_\mu$ and the variable $Y$.

Assertion (5) displays the transformation of the ‘drift’ component $B$ in the
triplet $(B, C, \nu)$.

Assertion (6) says that the quadratic characteristics of the continuous martingale
component $X^c$ do not change in fact under an absolutely continuous change of
measure (up to $\widetilde{P}$-stochastic equivalence).

Assertion (7) means that $Y$ is just the Radon-Nikodym derivative of $\widetilde{\nu}$ with
respect to $\nu$.


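3. If $X$ is a special semimartingale, then we can set $g(x) = x$ in the canonical
representation (2), so that

$$X = X_0 + \widetilde{B} + \widetilde{X}^c + x * (\mu - \widetilde{\nu}). \tag{8}$$

Hence we see that $X$ is a local martingale with respect to $\widetilde{P}$ if $\widetilde{B} = 0$. Taken together with
Theorem 1 this observation brings us to the following result.


THEOREM 2. Let $\widetilde{P} \overset{\mathrm{loc}}{\ll} P$ and, moreover, $(x^2 \wedge |x|) * \widetilde{\nu} \in \mathcal{A}^+_{\mathrm{loc}}$. Then a special
semimartingale $X$ is a local martingale with respect to the measure $\widetilde{P}$ if

$$B + \beta \cdot C + x(Y - 1) * \nu = 0. \tag{9}$$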


4. Formulas (3) and (4) show the way for finding $\beta$ and $Y$ once the process
$Z = (Z_t)_{t \ge 0}$ is known. The converse question comes naturally: how, knowing
$\beta$ and $Y$, can one find the corresponding process $Z$?

A solution of this problem opens a way to the construction of the measure $\widetilde{P}$
such that $X$ is a local martingale with respect to it. For if $\beta$ and $Y$ satisfy (9) and
we reconstruct the corresponding process $Z$, then, of course, taking the measure $\widetilde{P}_T$
with $d\widetilde{P}_T = Z_T\, dP_T$ we see that $X = (X_t)_{t \le T}$ is a local martingale on $[0, T]$.

Let $X$ be a semimartingale defined on the canonical space $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t \ge 0}, P)$.
Assume that each $P$-martingale $M$ admits an ‘$(X^c, \mu - \nu)$-representation’:

$$M = M_0 + f \cdot X^c + W * (\mu - \nu). \tag{10}$$

(See formula (1) in § 3f.)
THEOREM 3. Let $\widetilde{P} \overset{\mathrm{loc}}{\ll} P$, let $Z = (Z_t)_{t \ge 0}$ be the density process, let
$\nu(\{t\} \times E; \omega) = 0$ for $t > 0$, and let $\beta$ and $Y$ be defined by (3) and (4). Then (given the
property of ‘$(X^c, \mu - \nu)$-representability’) the process $Z$ satisfies the relation

$$Z = Z_0 + (Z_- \beta) \cdot X^c + Z_- (Y - 1) * (\mu - \nu). \tag{11}$$

If

$$\beta^2 \cdot \langle X^c \rangle_t + (1 - \sqrt{Y})^2 * \nu_t < \infty \tag{12}$$

for all $t > 0$, then the process $N = (N_t)_{t \ge 0}$ with

$$N_t = \beta \cdot X^c_t + (Y - 1) * (\mu - \nu)_t \tag{13}$$

is a $P$-local martingale. The process $Z = (Z_t)_{t \ge 0}$ is a solution of Doléans’s equation

$$dZ = Z_-\, dN \tag{14}$$

and can be represented in the following form:

$$Z_t = Z_0\, \mathcal{E}(N)_t, \tag{15}$$

where

$$\mathcal{E}(N)_t = e^{N_t - \frac{1}{2}\langle N^c \rangle_t} \prod_{0 < s \le t} (1 + \Delta N_s)\, e^{-\Delta N_s}. \tag{16}$$

In this statement we assume that $\nu(\{t\} \times E; \omega) = 0$. This means that the
process $X$ is quasi-left-continuous, i.e., for each predictable stopping time $\tau$ we have
$\Delta X_\tau = 0$ on the set $\{\tau < \infty\}$. In the general case this theorem is stated and proved
in [250; Chapter III, § 5a].
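A standard Poisson process gives perhaps the simplest illustration of formulas (7) and (13)-(16). Assume, for this sketch only, that $X$ is a Poisson process with intensity $\lambda$ under $P$, that $\beta = 0$, and that $Y \equiv \widetilde{\lambda}/\lambda$ is a constant. Then $N = (Y - 1) * (\mu - \nu)$ and (16) gives $Z_T = Y^{X_T} e^{-(Y - 1)\lambda T}$, while (7) says that the new compensator has intensity $\widetilde{\lambda}$. The code checks, by summing over the Poisson distribution of $X_T$, that $\mathsf{E} Z_T = 1$ and that the mean number of jumps under the new measure is $\widetilde{\lambda} T$. All numerical values and names are hypothetical.

```python
from math import exp, factorial, isclose

def poisson_pmf(n, mean):
    """P(X_T = n) for a Poisson random variable with the given mean."""
    return exp(-mean) * mean ** n / factorial(n)

def density(n_jumps, y, lam, T):
    """Z_T = E(N)_T from (16) for N = (Y - 1) * (mu - nu) with constant Y = y:
    each jump contributes a factor y, the compensator a factor exp(-(y - 1) * lam * T)."""
    return y ** n_jumps * exp(-(y - 1.0) * lam * T)

lam, lam_new, T = 2.0, 3.5, 1.0      # hypothetical intensities and horizon
y = lam_new / lam                    # Y = d(nu tilde) / d(nu), formula (7)

total_mass = sum(poisson_pmf(n, lam * T) * density(n, y, lam, T) for n in range(100))
mean_jumps_new = sum(n * poisson_pmf(n, lam * T) * density(n, y, lam, T)
                     for n in range(100))
print(total_mass, mean_jumps_new, lam_new * T)
```

The first number is one up to rounding, so $Z$ defines a probability measure on $\mathcal{F}_T$; the second coincides with $\widetilde{\lambda} T$, in agreement with $\widetilde{\nu} = Y \cdot \nu$.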


Remark. As regards direct applications of Theorems 2 and 3 to diffusion models, 
see the next section, § 4a. 


4. Arbitrage, Completeness, and Hedge Pricing
in Diffusion Models of Stock


§4a. Arbitrage and Conditions of Its Absence. Completeness 


1. We have already discussed at length the issue of the construction of probability
measures making price processes martingales or local martingales (both in the
discrete and the continuous-time case). Our interest in this issue is mainly a
consequence of the fact that the existence of equivalent martingale measures allows one
to assert the absence of opportunities for arbitrage in a fairly general context
(see §§2b,c). In addition, the knowledge of all such measures enables one, for
instance, to find the fair (rational) prices or hedging strategies, and so on, using
the machinery of martingales.

In the present section we consider the issue of the absence of arbitrage in the
case when prices are Itô processes (Chapter III, § 3d).


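2. Let $(\Omega, \mathcal{F}, P)$ be a probability space with a Brownian motion $B = (B_t)_{t \ge 0}$. We
shall denote by $(\mathcal{F}_t)_{t \ge 0}$ the Brownian (Wiener) filtration, i.e., the flow of $\sigma$-algebras
$\mathcal{F}_t = \sigma(\mathcal{F}_t^0 \cup \mathcal{N})$, where $\mathcal{F}_t^0 = \sigma(B_s,\ s \le t)$ and $\mathcal{N} = \{A \in \mathcal{F}\colon P(A) = 0\}$. (See
Chapter III, § 3a for detail.) In addition, we assume that $\mathcal{F} = \bigvee \mathcal{F}_t$ $\bigl(= \sigma\bigl(\bigcup \mathcal{F}_t\bigr)\bigr)$.

The filtered probability space $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t \ge 0}, P)$ satisfies the usual conditions
(Chapter III, § 3a) and we shall regard it as the stochastic basis describing the
probabilistic uncertainty and the structure of the flow of the incoming information.

Let $S_t = S_0 e^{H_t}$ ($S_0 > 0$) be the price process for an asset (some stock, say)
with

$$H_t = \int_0^t \Bigl(\mu_s - \frac{\sigma_s^2}{2}\Bigr)\, ds + \int_0^t \sigma_s\, dB_s, \tag{1}$$

where $\mu = (\mu_t, \mathcal{F}_t)$ and $\sigma = (\sigma_t, \mathcal{F}_t)$ are two stochastic processes satisfying (P-a.s.)
the conditions

$$\int_0^t |\mu_s|\, ds < \infty, \quad \int_0^t \sigma_s^2\, ds < \infty, \quad t > 0. \tag{2}$$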
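By Itô’s formula we obtain

$$dS_t = S_t\, d\widehat{H}_t, \tag{3}$$

where

$$\widehat{H}_t = \int_0^t \mu_s\, ds + \int_0^t \sigma_s\, dB_s, \tag{4}$$

i.e., $S$ has the stochastic differential

$$dS_t = S_t (\mu_t\, dt + \sigma_t\, dB_t). \tag{5}$$

If $\mu_t \equiv \mu$ and $\sigma_t \equiv \sigma \ne 0$, then we obtain the standard diffusion model of
Samuelson [420] describing the dynamics of stock prices by means of a geometric
Brownian motion (Chapter III, § 4b):

$$dS_t = S_t (\mu\, dt + \sigma\, dB_t). \tag{6}$$

We set

$$Z_t = \exp\Bigl( -\frac{\mu}{\sigma} B_t - \frac{1}{2} \Bigl(\frac{\mu}{\sigma}\Bigr)^2 t \Bigr). \tag{7}$$

Then $\mathsf{E} Z_t = 1$ and by Girsanov’s theorem (see Chapter II, §3b or § 3e), for each
$T > 0$ the process $S = (S_t, \mathcal{F}_t)_{t \le T}$ is a martingale with respect to the measure $\widetilde{P}_T$
with

$$d\widetilde{P}_T = Z_T\, dP_T, \tag{8}$$

where $P_T = P \,|\, \mathcal{F}_T$; its differential is

$$dS_t = \sigma S_t\, d\widetilde{B}_t, \tag{9}$$

where $\widetilde{B} = (\widetilde{B}_t, \mathcal{F}_t)_{t \le T}$ is a standard Brownian motion with respect to $\widetilde{P}_T$.

Thus, if $\mathsf{E} Z_T = 1$ then the measure $\widetilde{P}_T$ on $(\Omega, \mathcal{F}_T)$ is equivalent to $P_T$ and the
process $S = (S_t, \mathcal{F}_t)_{t \le T}$ becomes a martingale with respect to $\widetilde{P}_T$.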
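For constant coefficients, both $S_T$ and $Z_T$ are explicit functions of $B_T \sim N(0, T)$, so the effect of the change of measure (7)-(8) can be checked by one-dimensional numerical integration. The following sketch uses a plain trapezoidal rule against the standard normal density; the parameter values, the grid, and the function names are illustrative assumptions, not part of the model.

```python
from math import exp, sqrt, pi, isclose

def normal_expectation(f, n=20001, z_max=12.0):
    """E f(xi) for xi ~ N(0, 1), by the trapezoidal rule on [-z_max, z_max]."""
    h = 2.0 * z_max / (n - 1)
    total = 0.0
    for i in range(n):
        z = -z_max + i * h
        weight = 0.5 if i in (0, n - 1) else 1.0
        total += weight * f(z) * exp(-0.5 * z * z) / sqrt(2.0 * pi)
    return total * h

S0, mu, sigma, T = 100.0, 0.08, 0.25, 2.0   # hypothetical parameters

def S_T(z):
    """Geometric Brownian motion (6) at time T, with B_T = sqrt(T) * z."""
    return S0 * exp((mu - 0.5 * sigma ** 2) * T + sigma * sqrt(T) * z)

def Z_T(z):
    """Density (7) at time T, with B_T = sqrt(T) * z."""
    theta = mu / sigma
    return exp(-theta * sqrt(T) * z - 0.5 * theta ** 2 * T)

mean_Z = normal_expectation(Z_T)
mean_S_under_P = normal_expectation(S_T)
mean_S_under_P_tilde = normal_expectation(lambda z: Z_T(z) * S_T(z))
print(mean_Z, mean_S_under_P, mean_S_under_P_tilde)
```

The output shows $\mathsf{E} Z_T = 1$, the mean $S_0 e^{\mu T}$ of $S_T$ under the original measure, and $\mathsf{E} Z_T S_T = S_0$, which is the martingale property of $S$ under $\widetilde{P}_T$ evaluated at times $0$ and $T$.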

It is worth noting that this measure $\widetilde{P}_T$ is unique in the following sense: if $Q_T$
is another measure such that $Q_T \sim P_T$ and $S = (S_t, \mathcal{F}_t)_{t \le T}$ is a local martingale
with respect to $Q_T$, then $Q_T = \widetilde{P}_T$. This is intimately connected with the result
on the representation of local martingales in terms of the Brownian filtration (see
Chapter III, § 3c); we shall prove it below, in subsection 5.
3. We consider now the case when the price process $S$ has the differential (5).
Assume that the following conditions are satisfied: P-a.s. we have

$$\sigma_t > 0 \tag{10}$$

and

$$\int_0^t \Bigl(\frac{\mu_s}{\sigma_s}\Bigr)^2 ds < \infty. \tag{11}$$

Then there exists a well-defined process $Z = (Z_t)_{t \ge 0}$ with

$$Z_t = \exp\Bigl( -\int_0^t \frac{\mu_s}{\sigma_s}\, dB_s - \frac{1}{2} \int_0^t \Bigl(\frac{\mu_s}{\sigma_s}\Bigr)^2 ds \Bigr), \tag{12}$$

that is a positive local martingale, and the corresponding localizing sequence $(\tau_n)_{n \ge 1}$
can be taken in the form

$$\tau_n = \inf\Bigl\{ t\colon \int_0^t \Bigl(\frac{\mu_s}{\sigma_s}\Bigr)^2 ds \ge n \Bigr\}. \tag{13}$$

If $\mathsf{E} Z_T = 1$, then $Z = (Z_t)_{t \le T}$ is a martingale, the measure $\widetilde{P}_T$ such that
$d\widetilde{P}_T = Z_T\, dP_T$ is a probability measure, $\widetilde{P}_T \sim P_T$, and the process $S = (S_t, \mathcal{F}_t)_{t \le T}$ is a
local martingale with respect to this measure.

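To prove the last assertion we use Theorem 2 in § 3g.

Since $Z_t > 0$ for $t > 0$ and

$$dZ_t = -Z_t \frac{\mu_t}{\sigma_t}\, dB_t, \tag{14}$$

it follows that (see formula (3) in § 3g)

$$\beta_t = \frac{d\langle Z^c, S^c \rangle_t}{d\langle S^c, S^c \rangle_t} \cdot \frac{1}{Z_t}
= \frac{-Z_t \dfrac{\mu_t}{\sigma_t}\, S_t \sigma_t}{(S_t \sigma_t)^2} \cdot \frac{1}{Z_t}
= -\frac{\mu_t}{S_t \sigma_t^2}. \tag{15}$$

In the triplet $(B, C, \nu)$ of the semimartingale $S$ with respect to the measure $P$ we
have $\nu = 0$,

$$B_t = \int_0^t S_u \mu_u\, du, \quad \text{and} \quad C_t = \int_0^t S_u^2 \sigma_u^2\, du. \tag{16}$$

However,

$$B_t + \int_0^t \beta_u\, dC_u = \int_0^t \Bigl[ S_u \mu_u - \frac{\mu_u}{S_u \sigma_u^2}\, S_u^2 \sigma_u^2 \Bigr]\, du = 0.$$

Hence, by Theorem 2 in § 3g the process $S = (S_t, \mathcal{F}_t)_{t \le T}$ is a local martingale with
respect to $\widetilde{P}_T$. Moreover (assuming that $\mathsf{E} Z_T = 1$), the measure $\widetilde{P}_T$ is unique in
the same sense as in the case of $\mu_t \equiv \mu$ and $\sigma_t \equiv \sigma$ (see subsection 5 below).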
4. Now, for a market model consisting of a bank account $B(0) = (B_t(0))_{t \ge 0}$ with
$B_t(0) \equiv 1$ and stock $S = (S_t)_{t \ge 0}$ with dynamics described by (5), we state
conditions ensuring the absence of arbitrage (in its $NA_+$-version; see § 2a).

Let $\pi = (\beta, \gamma)$ be a strategy and let $X^\pi = (X^\pi_t)_{t \ge 0}$ be its value,

$$X^\pi_t = \beta_t + \gamma_t S_t.$$

If $\pi$ is a self-financing strategy, then

$$X^\pi_t = X^\pi_0 + \int_0^t \gamma_u\, dS_u, \tag{17}$$

which, of course, implies that the stochastic integral in (17) must be well defined.
As is clear from § 1a, the integral in (17) is well defined for $\gamma \in L(S)$. In the
current model (5) it would be reasonable to state conditions of the integrability of
$\gamma$ with respect to $S$ directly, in terms of the processes $(\mu_t)_{t \le T}$ and $(\sigma_t)_{t \le T}$.

We assume that conditions (2), (10), and (11) are satisfied and

$$\int_0^t \gamma_u^2 \sigma_u^2\, du < \infty \quad \text{(P-a.s.)}, \quad t > 0. \tag{18}$$

By the last condition and since $S = (S_t)_{t \ge 0}$ is continuous, it follows that
$\int_0^t S_u^2 \gamma_u^2 \sigma_u^2\, du < \infty$ (P-a.s.), and therefore, the stochastic integral
$\int_0^t S_u \gamma_u \sigma_u\, dB_u$ with respect to a Brownian motion is well defined (see
Chapter III, § 3c and § 1a in the present chapter).

Further,

$$\Bigl( \int_0^t |\gamma_u \mu_u|\, du \Bigr)^2 \le \int_0^t (\gamma_u \sigma_u)^2\, du \cdot \int_0^t \Bigl(\frac{\mu_u}{\sigma_u}\Bigr)^2 du.$$

Hence it follows from (11) and (18) that the integrals $\int_0^t \gamma_u \mu_u\, du$ and $\int_0^t S_u \gamma_u \mu_u\, du$,
$t > 0$, are well defined and finite (P-a.s.).

Thus, conditions (2), (10), (11), and (18) ensure the existence of the stochastic
integral in (17).

The next result, which is an immediate consequence of the implication (9) in
Theorem 2 in § 2b, is the best known assertion concerning the absence of arbitrage
in diffusion models.


THEOREM. Assume that stock prices $S = (S_t)_{t \ge 0}$ have the differential (5) and
conditions (2), (10), (11), and (18) hold for $t \le T$.

Let $\mathsf{E} Z_T = 1$. Then the property $NA_+$ holds and, in particular, there exist no
opportunities for arbitrage in the class of $a$-admissible self-financing strategies for
any $a \ge 0$.
5. We turn now to a result already mentioned in subsection 2: the measure $\widetilde{P}_T$ such
that $d\widetilde{P}_T = Z_T\, dP_T$, where $Z_T$ is defined in (7), is unique in the following sense:
this is a unique measure equivalent to $P_T$ such that the process $S = (S_t, \mathcal{F}_t)_{t \le T}$ is
a local martingale with respect to $\widetilde{P}_T$.

We shall consider a more general case, where we assume that $S$ is as in (5) and
the process $Z$ is defined by (12).

Let $Q_T$ be a measure equivalent to $P_T$ such that $S = (S_t, \mathcal{F}_t)_{t \le T}$ is a local
martingale with respect to $Q_T$.

Then we construct the martingale

$$N_t = \mathsf{E}\Bigl( \frac{dQ_T}{dP_T} \,\Big|\, \mathcal{F}_t \Bigr), \quad t \le T. \tag{19}$$

It is positive, therefore there exists a process $\varphi = (\varphi_t, \mathcal{F}_t)_{t \le T}$ such that

$$N_t = \exp\Bigl( \int_0^t \varphi_s\, dB_s - \frac{1}{2} \int_0^t \varphi_s^2\, ds \Bigr), \quad t \le T, \tag{20}$$

with $\int_0^T \varphi_s^2\, ds < \infty$ (P-a.s.), and $\mathsf{E} N_T = 1$ (see (20) in Chapter III, § 3c).

We now use Theorem 1 in §3g, which describes the transformation of the
triplet of predictable characteristics of semimartingales under absolutely
continuous changes of measure.

Let $(B^P, C^P, \nu^P)$ be the triplet of $S$ with respect to $P$. It follows from (5) that

$$B^P_t = \int_0^t S_u \mu_u\, du, \quad C^P_t = \int_0^t S_u^2 \sigma_u^2\, du, \quad \text{and} \quad \nu^P = 0. \tag{21}$$

By Theorem 1 in § 3g the triplet $(B^Q, C^Q, \nu^Q)$ (with respect to $Q$) is as follows:

$$B^Q_t = B^P_t + \int_0^t \beta_u\, dC^P_u, \quad C^Q = C^P, \quad \nu^Q = 0, \tag{22}$$

where

$$\beta_t = \frac{d\langle N^c, S^c \rangle_t}{d\langle S^c, S^c \rangle_t} \cdot \frac{1}{N_t} = \frac{\varphi_t}{S_t \sigma_t}. \tag{23}$$

Hence, from (21) and (22) we obtain

$$B^Q_t = \int_0^t S_u [\mu_u + \varphi_u \sigma_u]\, du. \tag{24}$$

Since $S = (S_t, \mathcal{F}_t)_{t \le T}$ is a local martingale with respect to $Q_T$, it follows that
$B^Q_t = 0$ (P-a.s.) for $t \le T$. Hence we see from (24) that

$$\varphi_u(\omega) = -\frac{\mu_u(\omega)}{\sigma_u(\omega)}$$

$(\lambda \times P_T)$-a.s. on $[0, T] \times \Omega$, where $\lambda$ is Lebesgue measure.

Hence the processes $Z = (Z_t)_{t \le T}$ and $N = (N_t)_{t \le T}$ are stochastically
indistinguishable, which shows that there exists a unique measure $\widetilde{P}_T \sim P_T$ such that
$S = (S_t, \mathcal{F}_t)_{t \le T}$ is a $\widetilde{P}_T$-local martingale.


6. We discuss now the issue of $T$-completeness (see the definition in § 2d). Assume
that the assumptions of the above theorem are satisfied. Assume also that for our
$(B(0), S)$-market we have $B_t(0) \equiv 1$ and the process $S = (S_t)_{t \le T}$ is a martingale
with respect to the measure $d\widetilde{P}_T = Z_T\, dP_T$. Then by the uniqueness property of
the (locally martingale) measure $\widetilde{P}_T$ established above and in accordance with the
theorem in § 2d, our diffusion model is $T$-complete.

A classical example of a $T$-complete (and arbitrage-free, in the $NA_+$ and $NA_g$-
versions) model is, of course, the model of a geometric Brownian motion (6), which
is a major factor in its popularity in financial mathematics and financial engineering.


§ 4b. Price of Hedging in Complete Markets 


1. We explained the notion of hedging and methods of ‘hedge pricing’ in complete
and incomplete markets for the discrete-time case in Chapter VI.

In general semimartingale models the discussion can proceed along a parallel
route, except, maybe, that we must describe concisely the classes of admissible
strategies.

We shall stick to the diffusion model of a $(B(0), S)$-market described in §4a.4
and use the notation from there.

Modifying slightly the definition of $T$-completeness given above (§ 2d) we shall
say that a nonnegative $\mathcal{F}_T$-measurable pay-off function $f_T$ with $\mathsf{E} Z_T f_T < \infty$ can
be replicated if there exists a strategy $\pi \in \Pi_+(S)$ such that $X^\pi_T = f_T$ (P-a.s.).
Clearly, if $f_T$ is bounded, then the condition $\mathsf{E} Z_T f_T < \infty$ is satisfied.


DEFINITION. If a pay-off function $f_T$ can be replicated, then we mean by the price
of (perfect) European hedging (cf. the nomenclature in Chapter IV, § 1b), or simply
the hedging price, the quantity

$$C(f_T; P) = \inf\{ x \ge 0\colon \exists\, \pi \in \Pi_+(S) \text{ with } X^\pi_0 = x,\ X^\pi_T = f_T \}. \tag{1}$$
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2. THEOREM. Let Pr be a unique martingale measure. Then 


C(fr;P) = Ep fr (=EZrfr). (2) 
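Proof. If $\pi = (\beta, \gamma)$ is an $a$-admissible self-financing strategy, then

$$X_t^\pi = X_0^\pi + \int_0^t \gamma_u\, dS_u \tag{3}$$

and (by the Ansel-Stricker theorem; see §1a.6) $X^\pi = (X_t^\pi)_{t \le T}$ is a $\widetilde{\mathsf{P}}_T$-supermartingale,
so that

$$\mathsf{E}_{\widetilde{\mathsf{P}}_T} X_T^\pi \le X_0^\pi. \tag{4}$$

Hence if $X_T^\pi = f_T$, then $\mathsf{E}_{\widetilde{\mathsf{P}}_T} f_T \le X_0^\pi$ and

$$\mathsf{E} Z_T f_T = \mathsf{E}_{\widetilde{\mathsf{P}}_T} f_T \le C(f_T; \mathsf{P}). \tag{5}$$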


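We claim now that there exists a 0-admissible self-financing strategy $\widetilde{\pi}$ of initial
value $X_0^{\widetilde{\pi}} = \mathsf{E} Z_T f_T$ that replicates $f_T$, i.e., $X_T^{\widetilde{\pi}} = f_T$ ($\mathsf{P}$-a.s.).
We consider the process

$$X_t = \mathsf{E}(Z_T f_T \mid \mathcal{F}_t), \qquad t \le T. \tag{6}$$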


Clearly, $X = (X_t, \mathcal{F}_t)_{t \le T}$ is a martingale with respect to the ‘Brownian filtration’
and by the representation theorem (see Chapter III, § 3c) there exists a process
$\psi = (\psi_t, \mathcal{F}_t)_{t \le T}$ with $\int_0^T \psi_s^2\, ds < \infty$ such that

$$X_t = X_0 + \int_0^t \psi_s\, dB_s. \tag{7}$$

Note that the process $(Z_t^{-1} X_t)_{t \le T}$ replicates $f_T$:

$$Z_T^{-1} X_T = f_T. \tag{8}$$

We now claim that there exists a 0-admissible self-financing portfolio $\widetilde{\pi} = (\widetilde{\beta}, \widetilde{\gamma})$
such that

$$X_t^{\widetilde{\pi}} = Z_t^{-1} X_t. \tag{9}$$


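Since

$$dZ_t = -Z_t \frac{\mu_t}{\sigma_t}\, dB_t, \tag{10}$$

it follows by Itô’s formula (Chapter III, § 3d) that

$$d(Z_t^{-1}) = Z_t^{-1}\left(\left(\frac{\mu_t}{\sigma_t}\right)^2 dt + \frac{\mu_t}{\sigma_t}\, dB_t\right) \tag{11}$$

and

$$
\begin{aligned}
d(Z_t^{-1} X_t) &= Z_t^{-1}\, dX_t + X_t\, d(Z_t^{-1}) + d(Z_t^{-1})\, dX_t \\
&= Z_t^{-1} \psi_t\, dB_t + X_t Z_t^{-1}\left(\left(\frac{\mu_t}{\sigma_t}\right)^2 dt + \frac{\mu_t}{\sigma_t}\, dB_t\right) + Z_t^{-1} \psi_t \frac{\mu_t}{\sigma_t}\, dt \qquad (12) \\
&= Z_t^{-1} \psi_t \left(dB_t + \frac{\mu_t}{\sigma_t}\, dt\right) + X_t Z_t^{-1} \frac{\mu_t}{\sigma_t}\left(dB_t + \frac{\mu_t}{\sigma_t}\, dt\right) \\
&= S_t(\sigma_t\, dB_t + \mu_t\, dt) \left[\frac{Z_t^{-1}}{S_t \sigma_t}\left(\psi_t + X_t \frac{\mu_t}{\sigma_t}\right)\right]. \qquad (13)
\end{aligned}
$$

We set

$$\widetilde{\gamma}_t = \frac{Z_t^{-1}}{S_t \sigma_t}\left(\psi_t + X_t \frac{\mu_t}{\sigma_t}\right). \tag{14}$$

Then we see that

$$d(Z_t^{-1} X_t) = \widetilde{\gamma}_t\, dS_t \tag{15}$$

and moreover,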
$$\int_0^t \widetilde{\gamma}_u^2 \sigma_u^2\, du < \infty \quad (\mathsf{P}\text{-a.s.}), \qquad t \le T. \tag{16}$$

(Cf. condition (18) in § 4a.)
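Thus, for $t \le T$,

$$Z_t^{-1} X_t = \mathsf{E}(Z_T f_T) + \int_0^t \widetilde{\gamma}_u\, dS_u. \tag{17}$$

Setting

$$\widetilde{\beta}_t = Z_t^{-1} X_t - \widetilde{\gamma}_t S_t, \tag{18}$$

we see from (15) that the strategy $\widetilde{\pi} = (\widetilde{\beta}, \widetilde{\gamma})$ is self-financing and of value
$X^{\widetilde{\pi}} = (X_t^{\widetilde{\pi}})_{t \le T}$ such that

$$X_0^{\widetilde{\pi}} = \mathsf{E}(Z_T f_T) \tag{19}$$

and

$$X_t^{\widetilde{\pi}} = Z_t^{-1} X_t, \qquad X_T^{\widetilde{\pi}} = f_T. \tag{20}$$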


From (5), comparing (19) and (20), we derive the required assertion (2). Note that
the strategy $\widetilde{\pi}$ so constructed is 0-admissible because $X_t^{\widetilde{\pi}} \ge 0$, $t \le T$.


§ 4c. Fundamental Partial Differential Equation of Hedge Pricing


1. We consider the model of a market formed by a riskless asset (a bank account
with zero interest rate, $B_t(0) \equiv 1$) and a risk asset $S = (S_t, \mathcal{F}_t)_{t \ge 0}$ whose
dynamics is described by relation (5) in § 4a.

As follows by the theorem in §4b, the hedging price $C(f_T; \mathsf{P})$ and the corresponding
hedging portfolio $\widetilde{\pi} = (\widetilde{\beta}, \widetilde{\gamma})$ can be found on the basis of the properties
of the process $Y = (Y_t, \mathcal{F}_t)_{t \le T}$, where

$$Y_t = Z_t^{-1}\, \mathsf{E}(Z_T f_T \mid \mathcal{F}_t). \tag{1}$$

In addition, the quantity

$$Y_0 = \mathsf{E}(Z_T f_T) \tag{2}$$

is precisely the price $C(f_T; \mathsf{P})$,

$$Y_T = f_T, \tag{3}$$

and $Y_t = X_t^{\widetilde{\pi}}$, i.e., $Y_t$ is the value of the hedging portfolio at time $t \le T$. (These
properties justify the name ‘hedging-price process’ for $Y$. As already mentioned on
several occasions, our method of finding $C(f_T; \mathsf{P})$ is usually called the martingale
method.)

In many cases we can explicitly find $\mathsf{E}(Z_T f_T)$ and, therefore, the price $C(f_T; \mathsf{P})$.
In particular, this can be done in the Black-Merton-Scholes model, where $\mu_t \equiv \mu$ and
$\sigma_t \equiv \sigma$ are constants. (See Chapter VIII below.)
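
As an illustration of the martingale method, the following sketch evaluates $\mathsf{E}(Z_T f_T)$ numerically for constant $\mu$ and $\sigma$ and a call pay-off $f_T = (S_T - K)^+$. It assumes the zero-rate model $dS_t = S_t(\mu\, dt + \sigma\, dB_t)$ with $Z_T = \exp(-(\mu/\sigma) B_T - \tfrac{1}{2}(\mu/\sigma)^2 T)$; the function names, the quadrature rule and the parameter values are illustrative choices, not part of the text.

```python
import math

def norm_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def martingale_price(payoff, s0, mu, sigma, T, n=4000, zmax=10.0):
    """Approximate E(Z_T f(S_T)) under P by the trapezoidal rule in z, where B_T = sqrt(T) * z."""
    theta = mu / sigma
    h = 2.0 * zmax / n
    total = 0.0
    for k in range(n + 1):
        z = -zmax + k * h
        b = math.sqrt(T) * z                                  # value of B_T
        s_T = s0 * math.exp((mu - 0.5 * sigma**2) * T + sigma * b)
        z_T = math.exp(-theta * b - 0.5 * theta**2 * T)       # density Z_T
        density = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
        weight = 0.5 if k in (0, n) else 1.0
        total += weight * z_T * payoff(s_T) * density
    return total * h

def call_price_zero_rate(s0, K, sigma, T):
    """Closed-form value of E(Z_T (S_T - K)^+) for zero interest rate."""
    d1 = (math.log(s0 / K) + 0.5 * sigma**2 * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    return s0 * norm_cdf(d1) - K * norm_cdf(d2)
```

Calling `martingale_price` with two different values of `mu` and the same `sigma` should give (up to quadrature error) the same number as `call_price_zero_rate`, which reflects the fact that the price does not depend on the drift.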


2. F. Black and M. Scholes [44], and R. Merton [346] (1973) proposed another 
method for finding the price C(fr;P) and the hedging strategy. It is based on the 
so-called fundamental equation that they have obtained. 

This method, which is now widely used in financial mathematics (see, in partic- 
ular, § 5c below), is essentially as follows. 

We consider the process $Y_t$ defined in (1). Since

$$Z_t^{-1} Z_T = \exp\left(-\int_t^T \frac{\mu(u, S_u)}{\sigma(u, S_u)}\, dB_u - \frac{1}{2} \int_t^T \left(\frac{\mu(u, S_u)}{\sigma(u, S_u)}\right)^2 du\right) \tag{4}$$

and $S = (S_t, \mathcal{F}_t)_{t \le T}$ is a Markov process, we see that if $f_T = f(T, S_T)$, then the
process $Y = (Y_t, \mathcal{F}_t)_{t \le T}$ is also Markov and $Y_t$ can be represented as $Y(t, S_t)$, where
$Y(t, S)$ is a measurable function.

In [44] and [346] the authors simply started from the assumption that the hedging
portfolio $\widetilde{\pi} = (\widetilde{\beta}, \widetilde{\gamma})$ exists and its value $Y_t$ ($= X_t^{\widetilde{\pi}}$) at time $t$ depends only on
the last value $S_t$ of the prices (rather than on the entire history $(S_u, u \le t)$).


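Another a priori assumption necessary for this method is that the function
$Y(t, S)$ is in the class $C^{1,2}$. This enables one to use Itô’s formula, which brings
one to the following stochastic partial differential equation (where for simplicity we
drop the arguments of functions):

$$dY = \left(\frac{\partial Y}{\partial t} + \mu S \frac{\partial Y}{\partial S} + \frac{1}{2} \sigma^2 S^2 \frac{\partial^2 Y}{\partial S^2}\right) dt + \sigma S \frac{\partial Y}{\partial S}\, dB. \tag{5}$$

We consider now another representation for $Y$:

$$Y(t, S_t) = X_t^{\widetilde{\pi}} = \widetilde{\beta}_t + \widetilde{\gamma}_t S_t. \tag{6}$$

In view of self-financing,

$$dY = \widetilde{\gamma}_t\, dS_t = \widetilde{\gamma}_t S_t(\mu_t\, dt + \sigma_t\, dB_t). \tag{7}$$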


Using the representations (5) and (7) of the special semimartingale $Y = Y(t, S_t)$
and the uniqueness of the representation of special semimartingales (see Chapter III,
§ 5b) we see that the coefficients of $dB$ (and of $dt$) in (5) and (7) must be
the same.

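Since $S_t > 0$ for $t > 0$, it follows therefore that ($\mathsf{P}$-a.s.)

$$\widetilde{\gamma}_t = \frac{\partial Y}{\partial S}(t, S_t); \tag{8}$$

moreover, the processes $(\widetilde{\gamma}_t)_{t \le T}$ and $\left(\frac{\partial Y}{\partial S}(t, S_t)\right)_{t \le T}$ are stochastically
indistinguishable.

Comparing the terms with $dt$ in (5) and (7) and taking account of (8) and the
equality $\widetilde{\beta}_t = Y(t, S_t) - S_t \frac{\partial Y}{\partial S}(t, S_t)$, we obtain the following relation (that holds
$(\lambda \times \mathsf{P})$-a.s., where $\lambda$ is Lebesgue measure in $[0, T]$):

$$\frac{\partial Y}{\partial t}(t, S_t) + \frac{1}{2} \sigma^2(t, S_t)\, S_t^2\, \frac{\partial^2 Y}{\partial S^2}(t, S_t) = 0, \qquad 0 < t < T.$$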


This, in turn, must hold if $Y = Y(t, S)$ satisfies the following Fundamental partial
differential equation for $0 < t < T$ and $0 < S < \infty$:

$$\frac{\partial Y}{\partial t}(t, S) + \frac{1}{2} \sigma^2(t, S)\, S^2\, \frac{\partial^2 Y}{\partial S^2}(t, S) = 0 \tag{9}$$

with boundary-value condition

$$Y(T, S) = f(T, S), \qquad S > 0. \tag{10}$$

(Cf. the backward Kolmogorov equation (6) in Chapter III, § 3f.)

We shall discuss the solution of this equation (which reduces, in any case for
$\sigma(t, S) = \sigma = \mathrm{Const}$, to the solution of the standard Feynman-Kac equation;
see (19) in Chapter III, § 3f) in connection with the Black-Scholes formula in the
case of $f(T, S) = (S - K)^+$, in Chapter VIII.

Here we point out the following features of this equation. 

Assume that there exists a unique solution to (9)-(10). We find $\widetilde{\gamma}_t$ by (8) and
define $\widetilde{\beta}_t$ by setting

$$\widetilde{\beta}_t = Y(t, S_t) - \widetilde{\gamma}_t S_t. \tag{11}$$


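Clearly, the value $X_t^{\widetilde{\pi}}$ of the portfolio $\widetilde{\pi} = (\widetilde{\beta}, \widetilde{\gamma})$ is precisely $Y(t, S_t)$.
It is not a priori clear if this portfolio $\widetilde{\pi}$ constructed from $Y(t, S)$ is self-financing,
i.e., whether

$$dY(t, S_t) = \widetilde{\gamma}_t\, dS_t. \tag{12}$$

However, this is an immediate consequence of (5) and (9):

$$dY(t, S_t) = S_t\left(\mu_t \frac{\partial Y}{\partial S}\, dt + \sigma_t \frac{\partial Y}{\partial S}\, dB_t\right) = S_t \frac{\partial Y}{\partial S}\left(\mu_t\, dt + \sigma_t\, dB_t\right) = \widetilde{\gamma}_t\, dS_t. \tag{13}$$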


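Thus, assume that the problem (9)-(10) has a unique solution $Y(t, S)$. Since
$X_t^{\widetilde{\pi}} = Y(t, S_t)$, it follows that $X_T^{\widetilde{\pi}} = f(T, S_T)$, while $X_0^{\widetilde{\pi}} = Y(0, S_0)$ is the initial
price of the portfolio $\widetilde{\pi} = (\widetilde{\beta}, \widetilde{\gamma})$.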
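
The relations (8)-(10) can be checked numerically in the simplest case. The sketch below assumes constant $\sigma$ and the call pay-off $f(T, S) = (S - K)^+$, takes the well-known zero-rate Black-Scholes expression as a candidate solution $Y(t, S)$, and verifies by central finite differences that it approximately satisfies (9) and the terminal condition (10). The function names, step size `h` and default parameters are assumptions made for the illustration.

```python
import math

def norm_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def Y(t, s, K=100.0, sigma=0.2, T=1.0):
    """Candidate solution of (9)-(10) for f(T, S) = (S - K)^+ and constant sigma."""
    tau = T - t
    if tau <= 0.0:
        return max(s - K, 0.0)           # terminal condition (10)
    d1 = (math.log(s / K) + 0.5 * sigma**2 * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    return s * norm_cdf(d1) - K * norm_cdf(d2)

def pde_residual(t, s, sigma=0.2, h=1e-3):
    """Finite-difference value of dY/dt + (1/2) sigma^2 S^2 d2Y/dS2, the left-hand side of (9)."""
    y_t = (Y(t + h, s, sigma=sigma) - Y(t - h, s, sigma=sigma)) / (2.0 * h)
    y_ss = (Y(t, s + h, sigma=sigma) - 2.0 * Y(t, s, sigma=sigma)
            + Y(t, s - h, sigma=sigma)) / h**2
    return y_t + 0.5 * sigma**2 * s**2 * y_ss

def hedge_ratio(t, s, sigma=0.2, h=1e-3):
    """Finite-difference version of (8): the stock holding is dY/dS(t, S)."""
    return (Y(t, s + h, sigma=sigma) - Y(t, s - h, sigma=sigma)) / (2.0 * h)
```

The residual returned by `pde_residual` should be close to zero at interior points, and `hedge_ratio` gives the number of shares held by the hedging portfolio; the bank-account component then follows from (11).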

The following heuristic arguments show that the price $X_0^{\widetilde{\pi}} = Y(0, S_0)$ (obtained
in the solution of (9)-(10)) has the properties of ‘rationality’, ‘fairness’, while the
portfolio $\widetilde{\pi} = (\widetilde{\beta}, \widetilde{\gamma})$ is an ‘optimal’ hedge.

In fact, let us interpret our problem as a search for a hedging strategy for a seller
of a European call option such that the value of this strategy replicates faithfully the
pay-off function $f(T, S_T)$. Our solution of (9)-(10) shows that selling this option
at the price $C = Y(0, S_0)$ the seller can find a strategy $\widetilde{\pi}$ such that $X_T^{\widetilde{\pi}}$ becomes
precisely equal to $f(T, S_T)$.

Assume now that the price $C$ asked for this option contract is higher than
$Y(0, S_0)$ and the buyer has accepted this price. Then, clearly, arbitrage is possible:
the seller can obtain the net profit of $C - Y(0, S_0)$ while meeting the terms
of the contract, because there exists a hedging portfolio of initial price $Y(0, S_0)$ that
replicates faithfully the pay-off function.

On the other hand, if $C < Y(0, S_0)$, then due to the uniqueness of the solution
of (9)-(10) the terms of the contract will not necessarily be fulfilled (at any rate, if
one must choose strategies in the Markov class).
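
The replication argument above can be illustrated by a small simulation. The sketch below is only a discrete-time approximation under stated assumptions: constant $\mu$ and $\sigma$, zero interest rate, a call pay-off, and rebalancing at finitely many dates, so the terminal portfolio value matches the pay-off only approximately. The function names, the seed and the parameter values are hypothetical.

```python
import math
import random

def norm_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def call_value_and_delta(t, s, K, sigma, T):
    """Y(t, S) and dY/dS for f = (S - K)^+, zero interest rate, constant sigma."""
    tau = T - t
    if tau <= 0.0:
        return max(s - K, 0.0), (1.0 if s > K else 0.0)
    d1 = (math.log(s / K) + 0.5 * sigma**2 * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    return s * norm_cdf(d1) - K * norm_cdf(d2), norm_cdf(d1)

def replicate(s0=100.0, K=100.0, mu=0.1, sigma=0.2, T=1.0, steps=2000, seed=1):
    """Return (terminal portfolio value, pay-off) along one simulated price path."""
    rng = random.Random(seed)
    dt = T / steps
    s = s0
    value, _ = call_value_and_delta(0.0, s0, K, sigma, T)   # initial capital Y(0, S_0)
    for i in range(steps):
        _, gamma = call_value_and_delta(i * dt, s, K, sigma, T)
        dB = rng.gauss(0.0, math.sqrt(dt))
        s_next = s * math.exp((mu - 0.5 * sigma**2) * dt + sigma * dB)
        value += gamma * (s_next - s)   # self-financing with B_t(0) = 1: dX = gamma dS
        s = s_next
    return value, max(s - K, 0.0)
```

A seller who receives more than $Y(0, S_0)$ can run the same strategy and keep the difference; increasing `steps` should reduce the gap between the two returned numbers.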

There are several weak points in this method based on the solution of the fundamental
equation, namely, the a priori assumptions of the ‘Markovian structure’
of the value of $\widetilde{\pi}$ (i.e., the representation $X_t^{\widetilde{\pi}} = Y(t, S_t)$) and of the $C^{1,2}$-regularity
of $Y(t, S)$ (enabling the use of Itô’s formula).

Fortunately, there exist other methods for the construction of hedging strategies
and finding the ‘rational’ price $C(f_T; \mathsf{P})$ (e.g., the ‘martingale’ method presented
in § 4b), which show, in particular, that a hedging portfolio does exist and its value
has the form $Y(t, S_t)$ with a sufficiently smooth function $Y$, so that equation (9)
actually holds. We present a more thorough analysis of the case of a standard
European call option with $f(T, S_T) = (S_T - K)^+$ in Chapter VIII, §1b, where
we discuss and use both the ‘martingale’ approach and the approach based on the above
fundamental equation.


3. In the above discussion we assumed that the riskless asset (a bank account)
$B(0) = (B_t(0))_{t \ge 0}$ has the form $B_t(0) \equiv 1$. In effect, this means that we deal
with discounted prices. However, in some cases one must consider the ‘absolute’
values of the prices rather than the ‘relative’, discounted ones. Here we present the
corresponding modifications in the case when the bank account $B(r) = (B_t(r))_{t \ge 0}$
(in some ‘absolute’ units) has the following form:

$$B_t(r) = B_0(r) \exp\left(\int_0^t r_s\, ds\right) \tag{14}$$

(here $(r_t)_{t \ge 0}$ is a deterministic nonnegative function, the interest rate), and the
risk asset (stock) is $S = (S_t(\mu, \sigma))_{t \ge 0}$, $S_0(\mu, \sigma) = S_0 > 0$, where

$$dS_t(\mu, \sigma) = S_t(\mu, \sigma)\left(\mu_t\, dt + \sigma_t\, dB_t\right). \tag{15}$$
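(Our assumptions about $\mu_t$, $\sigma_t$, and the Brownian motion $B = (B_t, \mathcal{F}_t)_{t \ge 0}$ are the
same as in § 4a.)

Let $\widetilde{\pi} = (\widetilde{\beta}, \widetilde{\gamma})$ be a self-financing portfolio and let

$$X_t^{\widetilde{\pi}} = \widetilde{\beta}_t B_t(r) + \widetilde{\gamma}_t S_t(\mu, \sigma). \tag{16}$$

We assume that $X_t^{\widetilde{\pi}}$ is as follows:

$$X_t^{\widetilde{\pi}} = Y(t, S_t),$$

where $S_t = S_t(\mu, \sigma)$ and $Y(t, S) \in C^{1,2}$. Then for $Y = Y(t, S)$ we obtain the same
equation (5).

On the other hand, since

$$dB_t(r) = r_t B_t(r)\, dt, \tag{17}$$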


it follows by the property of self-financing that

$$dY(t, S_t) = dX_t^{\widetilde{\pi}} = \left(\widetilde{\gamma}_t \mu_t S_t + \widetilde{\beta}_t r_t B_t(r)\right) dt + \widetilde{\gamma}_t \sigma_t S_t\, dB_t. \tag{18}$$

Comparing the terms containing $dB$ in (5) and (18) we obtain again that

$$\widetilde{\gamma}_t = \frac{\partial Y}{\partial S}(t, S_t),$$

and, as in our derivation of (9), we see that the coefficients of $dt$ in (5) and (18) are
the same if $Y(t, S)$ satisfies the following Fundamental equation for $S \in \mathbb{R}_+$ and
$0 < t < T$:

$$\frac{\partial Y}{\partial t} + r S \frac{\partial Y}{\partial S} + \frac{1}{2} \sigma^2 S^2 \frac{\partial^2 Y}{\partial S^2} = r Y, \tag{19}$$

with boundary condition $Y(T, S) = f(T, S)$, $S \in \mathbb{R}_+$.

It should be noted that $\mu = \mu(t, S)$ does not enter the equation or boundary
condition, and therefore $Y(0, S_0)$ is independent of $\mu$. This might appear puzzling
at first glance. (On the other hand we can consider this property to be desirable
because investors can have different ideas about the values of $\mu$ and $\sigma$, and therefore,
different notions of the actual dynamics of the price process $S = (S_t)_{t \ge 0}$.) Probably,
the best explanation can be given from the standpoint of the ‘martingale’ approach,
in which the price satisfies the relations

$$Y(0, S_0) = C(f_T; \mathsf{P}) = \mathsf{E}_{\widetilde{\mathsf{P}}_T} f(T, S_T)$$


(see formula (2) in §4b) and the process $S = (S_t)_{t \le T}$ is a local martingale with
respect to $\widetilde{\mathsf{P}}_T$ (by Girsanov’s theorem for semimartingales), with $dS_t = S_t \sigma_t\, d\widetilde{B}_t$,
where $\widetilde{B}$ is a Brownian motion with respect to $\widetilde{\mathsf{P}}_T$.

Hence we see that the price $C(f_T; \mathsf{P})$ is independent of $\mu$. However, the dependence
on the volatility $\sigma$ does not ‘wither away’ since the quadratic characteristics
of continuous martingale components do not change under absolutely continuous
changes of measure (see formula (6) in § 3g).
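
For the case of a bank account with interest rate $r$, the step from the two representations of $dY$ to the Fundamental equation can be written out as follows. This is a worked version of the comparison of $dt$-coefficients described above, using $\widetilde{\gamma}_t = \partial Y/\partial S$ and the identity $\widetilde{\beta}_t B_t(r) = Y - S\, \partial Y/\partial S$ that follows from the value formula (16); arguments of functions are dropped.

```latex
% dt-coefficient from Ito's formula (left) against the dt-coefficient from self-financing (right)
\begin{align*}
\frac{\partial Y}{\partial t} + \mu S \frac{\partial Y}{\partial S}
  + \frac{1}{2}\sigma^2 S^2 \frac{\partial^2 Y}{\partial S^2}
  &= \tilde\gamma_t \mu_t S_t + \tilde\beta_t r_t B_t(r) \\
  &= \mu S \frac{\partial Y}{\partial S}
     + r\Bigl(Y - S \frac{\partial Y}{\partial S}\Bigr), \\
\text{hence}\qquad
\frac{\partial Y}{\partial t} + r S \frac{\partial Y}{\partial S}
  + \frac{1}{2}\sigma^2 S^2 \frac{\partial^2 Y}{\partial S^2} &= r Y .
\end{align*}
```

The terms containing $\mu$ cancel on both sides, which is the algebraic reason why the drift does not enter the resulting equation.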


5. Arbitrage, Completeness, and Hedge Pricing 
in Diffusion Models of Bonds 


§5a. Models without Opportunities for Arbitrage 


1. In Chapter III, § 4c we considered several models of the term structure of prices
of bond families. In particular, we observed that there existed two approaches to the
description of the dynamics of bond prices $P(t, T)$: the indirect approach (when we
take some ‘interest rate’ process $r = (r(t))_{t \ge 0}$ as a ‘basis’ and assume that
$P(t, T) = F(t, r(t), T)$) and the direct one (when $P(t, T)$ is defined in a straightforward way,
as a solution to some stochastic differential equations).

These approaches bring forward distinct models. In the framework of our notion 
of a ‘fair’ market as a market without opportunities for arbitrage it would be natural 
to find out first of all what conditions ensure the absence of arbitrage in these models 
and how one can find ‘explicit’ expressions for the prices P(t, T) in these models. 


2. Taking the indirect approach we assume that the interest rate process
$r = (r(t))_{t \ge 0}$ is the solution of the stochastic differential equation (cf. (5) in
Chapter III, § 4c)

$$dr(t) = a(t, r(t))\, dt + b(t, r(t))\, dW_t, \tag{1}$$

generated by a Wiener process $W = (W_t, \mathcal{F}_t)_{t \ge 0}$. (As regards the Brownian
(Wiener) filtration $(\mathcal{F}_t)_{t \ge 0}$, see § 4a.) We also assume that the coefficients $a = a(t, r)$
and $b = b(t, r)$ are chosen so that equation (1) has a unique strong solution (Chapter III,
§3e).

In a natural way, one can associate with the interest rate $r = (r(t))_{t \ge 0}$ a bank
account $B(r) = (B_t(r))_{t \ge 0}$ with

$$B_t(r) = \exp\left(\int_0^t r(s)\, ds\right), \tag{2}$$

which, like in the case of stock or other assets, plays the role of a ‘gauge’ in the
valuation of various bonds. (We assume throughout that $\int_0^t |r(s)|\, ds < \infty$ ($\mathsf{P}$-a.s.),
$t > 0$.)

Let $P(t, T)$ be the price of some $T$-bond (see Chapter III, § 4c), which is assumed
to be $\mathcal{F}_t$-measurable for each $0 \le t \le T$, and $P(T, T) = 1$. Below we assume also
that the processes $(P(t, T))_{t \ge 0}$ are optional for each $T > 0$. Then, in particular,
$P(t, T)$ is $\mathcal{F}_t$-measurable for each $T > 0$. By the meaning of $P(t, T)$ as the price of
bonds with $P(T, T) = 1$ we must also assume that $0 \le P(t, T) \le 1$.

Now let us introduce the discounted price

$$\overline{P}(t, T) = \frac{P(t, T)}{B_t(r)}, \qquad 0 \le t \le T. \tag{3}$$

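Bearing in mind the result of the First fundamental theorem about the absence
of arbitrage (Chapter V, §2b) and based on the conviction that the ‘existence
of a martingale measure ensures (or almost ensures) the absence of arbitrage’ we
assume that there exists a martingale (or risk-neutral) measure $\widetilde{\mathsf{P}}_T$ on $\mathcal{F}_T$ such that
$\widetilde{\mathsf{P}}_T \sim \mathsf{P}_T$ ($= \mathsf{P} \,|\, \mathcal{F}_T$) and $(\overline{P}(t, T), \mathcal{F}_t)_{t \le T}$ is a $\widetilde{\mathsf{P}}_T$-martingale. Then we conclude
directly from (3) that

$$\mathsf{E}_{\widetilde{\mathsf{P}}_T}\left(\overline{P}(T, T) \mid \mathcal{F}_t\right) = \overline{P}(t, T), \qquad t \le T, \tag{4}$$

so that we have the following result.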


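THEOREM 1. If there exists a martingale measure $\widetilde{\mathsf{P}}_T \sim \mathsf{P}_T$ such that the discounted
process $(\overline{P}(t, T), \mathcal{F}_t)_{t \le T}$ is a $\widetilde{\mathsf{P}}_T$-martingale, then

$$P(t, T) = \mathsf{E}_{\widetilde{\mathsf{P}}_T}\left(\exp\left(-\int_t^T r(s)\, ds\right) \,\Big|\, \mathcal{F}_t\right). \tag{5}$$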


This is an immediate consequence of (4) and the condition $P(T, T) = 1$, for we
have

$$\mathsf{E}_{\widetilde{\mathsf{P}}_T}\left(\frac{1}{B_T(r)} \,\Big|\, \mathcal{F}_t\right) = \frac{P(t, T)}{B_t(r)},$$

which delivers the representation (5).
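
Representation (5) suggests a direct numerical recipe for $P(0, T)$: simulate the interest rate and average the discount factor. The sketch below is a plain Euler-scheme Monte Carlo estimate under the explicit assumption that the coefficients `a` and `b` passed in already describe the dynamics of $r$ under the martingale measure; the function name, the discretization and all parameter values are hypothetical choices for illustration.

```python
import math
import random

def bond_price_mc(a, b, r0, T, n_paths=2000, n_steps=200, seed=0):
    """Estimate E exp(-int_0^T r(s) ds) for dr = a(t, r) dt + b(t, r) dW by an Euler scheme."""
    rng = random.Random(seed)
    dt = T / n_steps
    total = 0.0
    for _ in range(n_paths):
        r, integral = r0, 0.0
        for i in range(n_steps):
            t = i * dt
            r_next = r + a(t, r) * dt + b(t, r) * rng.gauss(0.0, math.sqrt(dt))
            integral += 0.5 * (r + r_next) * dt   # trapezoidal rule for int r(s) ds
            r = r_next
        total += math.exp(-integral)
    return total / n_paths
```

For example, `bond_price_mc(lambda t, r: 0.5 * (0.04 - r), lambda t, r: 0.01, r0=0.03, T=2.0)` would estimate a two-year bond price for a mean-reverting rate; with `b` identically zero the estimate reduces to the deterministic discount factor.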


We see from (5) that if $r = (r(t))_{t \ge 0}$ is a Markov process with respect to $\widetilde{\mathsf{P}}_T$,
then the price $P(t, T)$ can be written as follows:

$$P(t, T) = F(t, r(t), T).$$

The ‘absence of arbitrage’ imposes automatically certain restrictions on the function
$F(t, r, T)$ (see § 5c below).

Remark 1. We point out that the price $P(t, T)$ of bonds cannot be unambiguously
evaluated on the basis of the state of the bank account $B(r)$ and the condition of
the absence of arbitrage (more precisely, of the existence of a martingale measure).
In fact there is no reason for $\widetilde{\mathsf{P}}_T$ to be unique, which means that $P(t, T)$ can be
realized (in the form (5)) in various ways depending on a particular measure $\widetilde{\mathsf{P}}_T$.

We note also that if, instead of $P(T, T) = 1$, we require that $P(T, T)$ be equal
to $f_T$, where $f_T$ is $\mathcal{F}_T$-measurable and $\mathsf{E}_{\widetilde{\mathsf{P}}_T} \frac{|f_T|}{B_T(r)} < \infty$, then by (4) we obtain

$$\mathsf{E}_{\widetilde{\mathsf{P}}_T}\left(\frac{f_T}{B_T(r)} \,\Big|\, \mathcal{F}_t\right) = \frac{P(t, T)}{B_t(r)},$$

and therefore

$$P(t, T) = \mathsf{E}_{\widetilde{\mathsf{P}}_T}\left\{f_T \exp\left(-\int_t^T r(s)\, ds\right) \,\Big|\, \mathcal{F}_t\right\}. \tag{6}$$

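3. We proceed now from one fixed $T$-bond to a family of $T$-bonds:

$$\mathcal{P} = \{P(t, T);\ 0 \le t \le T,\ T > 0\}.$$

DEFINITION 1. Let $\mathsf{P}$ be a probability measure on $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t \ge 0})$ with $\mathcal{F} = \bigvee \mathcal{F}_t$.
We say that a measure $\widetilde{\mathsf{P}} \overset{\mathrm{loc}}{\sim} \mathsf{P}$ (i.e., $\widetilde{\mathsf{P}}_t \sim \mathsf{P}_t$ for $t \ge 0$) is a local martingale measure
for the family $\mathcal{P}$, if for each $T > 0$ the discounted prices $\overline{P}(t, T) = \frac{P(t, T)}{B_t(r)}$, $t \le T$,
are local $\widetilde{\mathsf{P}}$-martingales.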


To define an arbitrage-free (B,P)-market formed by a bank account B and a 
family of bonds P we must, first of all, discuss the notion of portfolio (strategy) in 
this case. 


DEFINITION 2 ([38]). A strategy $\pi = (\beta, \gamma)$ in a $(B, \mathcal{P})$-market is a pair of a
predictable process $\beta = (\beta_t)_{t \ge 0}$ and a family of finite (real-valued) Borel measures
$\gamma = (\gamma_t(\cdot))_{t \ge 0}$ such that for all $t$ and $\omega$ the set function $\gamma_t = \gamma_t(dT)$ is a measure
in $(\mathbb{R}_+, \mathcal{B}(\mathbb{R}_+))$ with support concentrated on $[t, \infty)$ and for each $A \in \mathcal{B}(\mathbb{R}_+)$ the
process $(\gamma_t(A))_{t \ge 0}$ is predictable.


The meaning of $\beta$ and $\gamma$ is transparent: $\beta_t$ is the number of ‘unit’ bank accounts
(at time $t$) and $\gamma_t(dT)$ is the ‘number’ of bonds with maturity date in the interval
$[T, T + dT]$.


DEFINITION 3. The value of a strategy $\pi$ is the (random) process $X^\pi = (X_t^\pi)_{t \ge 0}$
with

$$X_t^\pi = \beta_t B_t(r) + \int_t^\infty P(t, T)\, \gamma_t(dT). \tag{7}$$

(We assume here that the Lebesgue-Stieltjes integrals in (7) are defined for all
$t$ and $\omega$.)


4. We define now a self-financing portfolio $\pi = (\beta, \gamma)$ in a $(B, \mathcal{P})$-market.

To this end, taking the direct approach (Chapter III, §4c) we assume that the
dynamics of the prices $P(t, T)$ can be described by the HJM-model; namely, for
$0 \le t \le T$ and $T > 0$ we have

$$dP(t, T) = P(t, T)\left(A(t, T)\, dt + B(t, T)\, dW_t\right), \tag{8}$$

where $W = (W_t)_{t \ge 0}$ is a standard Wiener process, which plays the role of the source
of randomness. We must add to equations (8) the boundary conditions $P(T, T) = 1$,
$T > 0$. (For a discussion of the condition of the measurability of $A(t, T)$ and $B(t, T)$
and the solubility of (8), see Chapter III, § 4c.)

In view of the equation

$$dB_t(r) = r(t) B_t(r)\, dt \tag{9}$$

we obtain by Itô’s formula that the process

$$\overline{P}(t, T) = \frac{P(t, T)}{B_t(r)} \tag{10}$$

has the differential (with respect to $t$ for each $T$)

$$d\overline{P}(t, T) = \overline{P}(t, T)\left([A(t, T) - r(t)]\, dt + B(t, T)\, dW_t\right). \tag{11}$$


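We have said that a strategy $\pi = (\beta, \gamma)$ of value $X_t^\pi = \beta_t B_t + \gamma_t S_t$ in a diffusion
$(B, S)$-market is self-financing if

$$dX_t^\pi = \beta_t\, dB_t + \gamma_t\, dS_t, \tag{12}$$

i.e.,

$$X_t^\pi = X_0^\pi + \int_0^t \beta_u\, dB_u + \int_0^t \gamma_u\, dS_u. \tag{13}$$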
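In the present case of a diffusion $(B, \mathcal{P})$-market it is reasonable to say that a
strategy $\pi = (\beta, \gamma)$ of value $X_t^\pi = \beta_t B_t(r) + \int_t^\infty P(t, T)\, \gamma_t(dT)$ is self-financing ([38])
if (in the symbolic notation)

$$dX_t^\pi = \beta_t\, dB_t(r) + \int_t^\infty dP(t, T)\, \gamma_t(dT), \tag{14}$$

which (in view of (8)) should be interpreted as follows:

$$X_t^\pi = X_0^\pi + \int_0^t \beta_s\, dB_s(r) + \int_0^t \left[\int_s^\infty A(s, T) P(s, T)\, \gamma_s(dT)\right] ds + \int_0^t \left[\int_s^\infty B(s, T) P(s, T)\, \gamma_s(dT)\right] dW_s. \tag{15}$$

If a strategy $\pi = (\beta, \gamma)$ is self-financing, then its discounted value

$$\overline{X}_t^\pi = \frac{X_t^\pi}{B_t(r)} \tag{16}$$

satisfies the relation

$$d\overline{X}_t^\pi = \int_t^\infty d\overline{P}(t, T)\, \gamma_t(dT). \tag{17}$$

As in (14), relation (17) is symbolic and (in view of (11)) it means that

$$\overline{X}_t^\pi = \overline{X}_0^\pi + \int_0^t \left[\int_s^\infty (A(s, T) - r(s))\, \overline{P}(s, T)\, \gamma_s(dT)\right] ds + \int_0^t \left[\int_s^\infty B(s, T)\, \overline{P}(s, T)\, \gamma_s(dT)\right] dW_s. \tag{18}$$
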
5. To state conditions for the absence of arbitrage in (B, P)-models we discuss first 
of all the existence of martingale measures. 

To this end we define the functions A(t, T) and B(t, T) also for t > T by setting 
A(t, T) = r(t) and B(t, T) = B(T,T). 

Then we immediately see from (11) that in order that the family of prices
$(\overline{P}(t, T))_{t \le T}$ be a local martingale for each $T > 0$ with respect to the original
measure $\mathsf{P}$ it is necessary that

$$A(t, T) = r(t). \tag{19}$$

By (8) we obtain that in this case

$$dP(t, T) = P(t, T)\left(r(t)\, dt + B(t, T)\, dW_t\right)$$

and

$$d\overline{P}(t, T) = \overline{P}(t, T) B(t, T)\, dW_t. \tag{20}$$

Taking into account relations (14) and (15) in Chapter III, § 4c and the equality
$\frac{\partial A(t, T)}{\partial T} = 0$ holding under the assumption (19) we obtain the following relation
for $f(t, T)$:

$$df(t, T) = a(t, T)\, dt + b(t, T)\, dW_t,$$

where

$$a(t, T) = b(t, T) \int_t^T b(t, s)\, ds.$$
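
The drift restriction $a(t, T) = b(t, T) \int_t^T b(t, s)\, ds$ can be evaluated numerically for any given volatility function. The sketch below uses the trapezoidal rule; the function name `hjm_drift`, the grid size and the two volatility functions used to exercise it (a constant and an exponentially decaying one) are assumptions for illustration only.

```python
import math

def hjm_drift(b, t, T, n=2000):
    """Return a(t, T) = b(t, T) * int_t^T b(t, s) ds, with the integral by the trapezoidal rule."""
    if T <= t:
        return 0.0
    h = (T - t) / n
    integral = 0.5 * (b(t, t) + b(t, T))
    for k in range(1, n):
        integral += b(t, t + k * h)
    return b(t, T) * integral * h
```

For a constant volatility $b \equiv \sigma$ the function returns approximately $\sigma^2 (T - t)$, and for $b(t, s) = \sigma e^{-\kappa (s - t)}$ it returns approximately $\frac{\sigma^2}{\kappa} e^{-\kappa (T - t)} (1 - e^{-\kappa (T - t)})$, as direct integration shows.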


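On the other hand, if (19) fails, then it would be natural (by analogy with the
case of stock) to refer to the ideas underlying Girsanov’s theorem.

Assume that besides the original measure $\mathsf{P}$ on the space $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t \ge 0})$ with
$\mathcal{F} = \bigvee \mathcal{F}_t$ there exists a probability measure $\widetilde{\mathsf{P}}$ such that $\widetilde{\mathsf{P}} \overset{\mathrm{loc}}{\sim} \mathsf{P}$, i.e., $\widetilde{\mathsf{P}}_t \sim \mathsf{P}_t$,
$t \ge 0$.

We set $Z_t = \frac{d\widetilde{\mathsf{P}}_t}{d\mathsf{P}_t}$. Since $(\mathcal{F}_t)_{t \ge 0}$ is the Brownian (Wiener) filtration, it follows
by the theorem on the representation of positive local martingales (see formula (22)
in Chapter III, § 3c) that

$$Z_t = \exp\left(\int_0^t \varphi(s)\, dW_s - \frac{1}{2} \int_0^t \varphi^2(s)\, ds\right), \tag{21}$$

where the $\varphi(s)$ are $\mathcal{F}_s$-measurable, $\int_0^t \varphi^2(s)\, ds < \infty$ ($\mathsf{P}$-a.s.), and $\mathsf{E} Z_t = 1$ for each
$t > 0$.

By Girsanov’s theorem (Chapter III, § 3e) the process $\widetilde{W} = (\widetilde{W}_t)_{t \ge 0}$ with

$$\widetilde{W}_t = W_t - \int_0^t \varphi(s)\, ds \tag{22}$$

is Wiener with respect to $\widetilde{\mathsf{P}}$. Hence

$$dP(t, T) = P(t, T)\left[(A(t, T) + \varphi(t) B(t, T))\, dt + B(t, T)\, d\widetilde{W}_t\right] \tag{23}$$

with respect to $\widetilde{\mathsf{P}}$ and

$$d\overline{P}(t, T) = \overline{P}(t, T)\left[(A(t, T) + \varphi(t) B(t, T) - r(t))\, dt + B(t, T)\, d\widetilde{W}_t\right] \tag{24}$$

(cf. (8) and (11)).

Hence (cf. (11)) the processes $(\overline{P}(t, T))_{t \le T}$ are local martingales with respect to
$\widetilde{\mathsf{P}}$ for all $T > 0$ if and only if

$$A(t, T) + \varphi(t) B(t, T) - r(t) = 0. \tag{25}$$

By (23) we obtain in this case that

$$dP(t, T) = P(t, T)\left(r(t)\, dt + B(t, T)\, d\widetilde{W}_t\right), \tag{26}$$

where $\widetilde{W} = (\widetilde{W}_t)_{t \ge 0}$ is a Wiener process with respect to $\widetilde{\mathsf{P}}$.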


6. The definition of the property of a strategy $\pi = (\beta, \gamma)$ in a $(B, \mathcal{P})$-market to be
arbitrage-free (say, in the $NA_+$-version) at an instant $T$ is as in § 1c. We say that
a $(B, \mathcal{P})$-market is arbitrage-free if it has this quality for all $T > 0$.

THEOREM 2. Let $\widetilde{\mathsf{P}} \overset{\mathrm{loc}}{\sim} \mathsf{P}$ be a measure such that the density process $Z = (Z_t)_{t \ge 0}$
has the form (21) and condition (25) holds.

Then there are no opportunities for arbitrage for any $a > 0$ in the class of
$a$-admissible strategies $\pi$ ($\overline{X}_t^\pi \ge -a$, $t > 0$) such that

$$\int_0^t \left[\int_s^\infty B(s, T)\, \overline{P}(s, T)\, \gamma_s(dT)\right]^2 ds < \infty, \qquad t > 0. \tag{27}$$


Proof. If (25) and (27) are satisfied, then the process $\overline{X}^\pi = (\overline{X}_t^\pi)_{t \ge 0}$ is a $\widetilde{\mathsf{P}}$-local
martingale by relation (18).

In view of the $a$-admissibility ($\overline{X}_t^\pi \ge -a$, $t > 0$) this process is also a supermartingale.
Hence if $\overline{X}_0^\pi = 0$, then $\mathsf{E}_{\widetilde{\mathsf{P}}} \overline{X}_T^\pi \le 0$ for each $T > 0$. However,
$\widetilde{\mathsf{P}}(\overline{X}_T^\pi \ge 0) = \mathsf{P}(X_T^\pi \ge 0) = 1$. Hence $X_T^\pi = 0$ ($\mathsf{P}$- and $\widetilde{\mathsf{P}}$-a.s.) for $T > 0$, which
completes the proof.


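Remark 2. Assume that the function

$$\frac{r(t) - A(t, T)}{B(t, T)} \tag{28}$$

is independent of $T$ and

$$\int_0^t \left(\frac{r(s) - A(s, T)}{B(s, T)}\right)^2 ds < \infty \quad (\mathsf{P}\text{-a.s.}) \tag{29}$$

for each $t > 0$. In this case, looking for a measure $\widetilde{\mathsf{P}}$ with the property $\widetilde{\mathsf{P}} \overset{\mathrm{loc}}{\sim} \mathsf{P}$ one
could proceed as follows.

We denote the function in (28) by $\varphi = \varphi(t)$, $t \le T$, introduce the process
$Z = (Z_t)_{t \ge 0}$ defined by (21), and assume that $\mathsf{E} Z_t = 1$, $t > 0$. Then for each $t > 0$
the measure $\widetilde{\mathsf{P}}_t$ with $d\widetilde{\mathsf{P}}_t = Z_t\, d\mathsf{P}_t$ is a probability measure such that $\widetilde{\mathsf{P}}_t \sim \mathsf{P}_t$.

The family of measures $\{\widetilde{\mathsf{P}}_t, t \ge 0\}$ is compatible (in the sense that $\widetilde{\mathsf{P}}_s = \widetilde{\mathsf{P}}_t \,|\, \mathcal{F}_s$
for $s \le t$) and if there exists a probability measure $\widetilde{\mathsf{P}}$ on $(\Omega, \mathcal{F})$ with restrictions
$\widetilde{\mathsf{P}} \,|\, \mathcal{F}_t = \widetilde{\mathsf{P}}_t$, $t \ge 0$ (so that $\widetilde{\mathsf{P}} \overset{\mathrm{loc}}{\sim} \mathsf{P}$),
then it is the required one.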

If the maturity dates of $T$-bonds in our $(B, \mathcal{P})$-market satisfy the inequality
$T \le T_0$, where $T_0 < \infty$, then we can take $\widetilde{\mathsf{P}}_{T_0}$ as the required measure $\widetilde{\mathsf{P}}$.

It is equally clear that if $Z_\infty = \lim_{t \to \infty} Z_t$ and we have $\mathsf{E} Z_\infty = 1$ and $\mathsf{P}(Z_\infty > 0) = 1$,
then the measure $\widetilde{\mathsf{P}}$ with $d\widetilde{\mathsf{P}} = Z_\infty\, d\mathsf{P}$ is the required martingale measure with property
$\widetilde{\mathsf{P}} \overset{\mathrm{loc}}{\sim} \mathsf{P}$.


7. We present now an example of an arbitrage-free $(B, \mathcal{P})$-model.
Following [36] and [219], we start from the forward interest rate $f(t, T)$ with
stochastic differential (with respect to $t$ for fixed $T$)

$$df(t, T) = a(t, T)\, dt + b(t, T)\, dW_t, \tag{30}$$

where

$$b(t, T) \equiv \sigma > 0, \tag{31}$$

$$a(t, T) = \sigma^2 (T - t), \qquad t \le T. \tag{32}$$

Then (30) takes the following form:

$$df(t, T) = \sigma^2 (T - t)\, dt + \sigma\, dW_t, \tag{33}$$

and therefore

$$f(t, T) = f(0, T) + \sigma^2 t \left(T - \frac{t}{2}\right) + \sigma W_t, \tag{34}$$

where $f(0, T)$ is the instantaneous forward interest rate of $T$-bonds on the $(B, \mathcal{P})$-market
(at time $t = 0$).

By (34) and the definition $r(t) = f(t, t)$ we obtain

$$r(t) = f(0, t) + \frac{\sigma^2 t^2}{2} + \sigma W_t. \tag{35}$$

Hence the interest rate $r = (r(t))_{t \ge 0}$ satisfies the equation

$$dr(t) = \left(\frac{\partial f(0, t)}{\partial t} + \sigma^2 t\right) dt + \sigma\, dW_t. \tag{36}$$

(Cf. the Ho-Lee model (12) in Chapter III, §4c.) The coefficients $A(t, T)$ and
$B(t, T)$ in (8) can be calculated on the basis of $a(t, T) = \sigma^2 (T - t)$ and $b(t, T) = \sigma$
in (33) as follows:

$$A(t, T) = r(t) - \int_t^T a(t, s)\, ds + \frac{1}{2} \left(\int_t^T b(t, s)\, ds\right)^2 = r(t), \tag{37}$$

$$B(t, T) = -\sigma (T - t). \tag{38}$$

Hence condition (19) holds in our $(B, \mathcal{P})$-model, and therefore the original measure
$\mathsf{P}$ is a martingale measure and no arbitrage is possible.


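The prices $P(t, T)$ themselves can be found from the equation

$$dP(t, T) = P(t, T)\left[r(t)\, dt - \sigma (T - t)\, dW_t\right], \qquad t \le T,$$

which must be solved for each $T > 0$ under the condition $P(T, T) = 1$.
We can also use the equality

$$P(t, T) = \exp\left(-\int_t^T f(t, s)\, ds\right), \qquad t \le T, \tag{39}$$

holding by (2) in Chapter III, § 4c.
By (34),

$$
\begin{aligned}
\int_t^T f(t, s)\, ds &= \int_t^T \left[f(0, s) + \sigma^2 t \left(s - \frac{t}{2}\right)\right] ds + \sigma (T - t) W_t \\
&= \int_t^T f(0, s)\, ds + \frac{\sigma^2}{2}\, t T (T - t) + \sigma (T - t) W_t.
\end{aligned}
$$

Consequently,

$$
\begin{aligned}
P(t, T) &= \exp\left\{-\int_t^T f(0, s)\, ds - \frac{\sigma^2}{2}\, t T (T - t) - \sigma (T - t) W_t\right\} \\
&= \frac{P(0, T)}{P(0, t)} \exp\left\{-\frac{\sigma^2}{2}\, t T (T - t) - \sigma (T - t) W_t\right\}. \qquad (40)
\end{aligned}
$$

Hence we obtain from (35) the following representation for $P(t, T)$ in terms of
the interest rate $r(t)$:

$$P(t, T) = \frac{P(0, T)}{P(0, t)} \exp\left\{(T - t) f(0, t) - \frac{\sigma^2}{2}\, t (T - t)^2 - (T - t)\, r(t)\right\}. \tag{41}$$

(Cf. affine models in Chapter III, § 4c and in § 5c below.)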
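
The consistency of the formulas in this example can be checked numerically. The sketch below assumes a flat initial forward curve $f(0, s) \equiv f_0$ (so that $P(0, T) = e^{-f_0 T}$), fixes a value `w_t` of $W_t$, and compares three expressions for $P(t, T)$: the integral (39) of the forward rates (34), the closed form (40), and the expression obtained by substituting $\sigma W_t = r(t) - f(0, t) - \sigma^2 t^2 / 2$ from (35) into (40). Function names and parameter values are illustrative assumptions.

```python
import math

def forward_rate(t, s, w_t, f0, sigma):
    """Formula (34) with a flat initial curve f(0, s) = f0 (an assumption of this sketch)."""
    return f0 + sigma**2 * t * (s - 0.5 * t) + sigma * w_t

def price_by_39(t, T, w_t, f0, sigma, n=1000):
    """P(t, T) = exp(-int_t^T f(t, s) ds), with the integral by the trapezoidal rule."""
    h = (T - t) / n
    total = 0.5 * (forward_rate(t, t, w_t, f0, sigma) + forward_rate(t, T, w_t, f0, sigma))
    for k in range(1, n):
        total += forward_rate(t, t + k * h, w_t, f0, sigma)
    return math.exp(-total * h)

def price_by_40(t, T, w_t, f0, sigma):
    """Closed form (40); P(0, T) / P(0, t) = exp(-f0 (T - t)) for the flat curve."""
    ratio = math.exp(-f0 * (T - t))
    return ratio * math.exp(-0.5 * sigma**2 * t * T * (T - t) - sigma * (T - t) * w_t)

def price_by_rate(t, T, r_t, f0, sigma):
    """Bond price written in terms of the interest rate r(t), after eliminating W_t via (35)."""
    ratio = math.exp(-f0 * (T - t))
    return ratio * math.exp((T - t) * f0 - 0.5 * sigma**2 * t * (T - t)**2 - (T - t) * r_t)
```

With `r_t = f0 + 0.5 * sigma**2 * t**2 + sigma * w_t`, all three functions should return the same number up to rounding, because the integrand in (39) is linear in $s$ and the trapezoidal rule is then exact.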


8. In the above diffusion models we have assumed that the interest rates $r = (r(t))$,
the forward interest rates $f = (f(t, T))$, and the bond prices

$$\mathcal{P} = \{P(t, T);\ 0 \le t \le T,\ T < \infty\}$$

themselves have a single source of randomness: a Wiener process $W = (W_t)_{t \ge 0}$.

In the vast literature on the dynamics of bond prices the authors discuss also
some other models, where one Wiener process $W = (W_t)_{t \ge 0}$ is replaced by a multivariate
Wiener process $W = (W^1, \ldots, W^n)$. To take into account also jumps in the
prices $P(t, T)$ one invokes other ‘sources of randomness’: point processes, marked
point processes, Lévy, and some other processes.

Referring the reader to the special literature (see, e.g., [36], [38], [128], and
the bibliography therein), we present here only a few models equipped with such
‘sources of randomness’.

In [36] and [38] the authors generalize models of type (1) by introducing models
of ‘diffusion with jumps’ kind:

$$dr(t) = a_t\, dt + \sum_{i=1}^d b_t^i\, dW_t^i + \int_E q(t, x)\, \mu(dt, dx), \tag{42}$$

where $\mu = \mu(dt, dx)$ is an integer-valued random measure in $\mathbb{R}_+ \times \Omega \times E$ and
$(W^1, \ldots, W^d)$ are independent Wiener processes.

One must also make the corresponding modification in the description of the
dynamics of $P(t, T)$ and $f(t, T)$:

$$dP(t, T) = P(t, T)\left(A(t, T)\, dt + \sum_{i=1}^d B_i(t, T)\, dW_t^i\right) + P(t-, T) \int_E q(t, x, T)\, \mu(dt, dx), \tag{43}$$

$$df(t, T) = a(t, T)\, dt + \sum_{i=1}^d b_i(t, T)\, dW_t^i + \int_E \delta(t, x, T)\, \mu(dt, dx). \tag{44}$$


9. Following [128] we consider now models with Lévy processes as ‘sources of randomness’.
To this end we consider first equation (20), which we rewrite as follows:

$$dP(t, T) = P(t, T)\, d\widehat{H}(t, T), \tag{45}$$

where

$$\widehat{H}(t, T) = \int_0^t \left(r(s)\, ds + B(s, T)\, dW_s\right). \tag{46}$$

We also set

$$H(t, T) = \int_0^t \left(\left[r(s) - \frac{B^2(s, T)}{2}\right] ds + B(s, T)\, dW_s\right). \tag{47}$$

Then (see (9)-(13) in § 3d) we have the representations

$$P(t, T) = P(0, T)\, \mathcal{E}(\widehat{H}(\cdot, T))_t \tag{48}$$


and

$$P(t, T) = P(0, T)\, e^{H(t, T)}. \tag{49}$$

In view of (47),

$$\overline{P}(t, T) = \frac{P(t, T)}{B_t(r)} = P(0, T) \exp\left\{\int_0^t B(s, T)\, dW_s - \frac{1}{2} \int_0^t B^2(s, T)\, ds\right\}. \tag{50}$$

If, for instance, the function $B(s, T)$, $s \le T$, is bounded, then we see that the
expression on the right-hand side of (50) is a martingale.

Now, we replace the Wiener process $W = (W_t)_{t \ge 0}$ with a Lévy process
$L = (L_t)_{t \ge 0}$ (see Chapter II, §1b). What can be the form of the processes $\widehat{H}(t, T)$
and $H(t, T)$ if, in place of the integrals $\int_0^t B(s, T)\, dW_s$, we consider now the integrals
$\int_0^t B(s, T)\, dL_s$ interpreted as stochastic integrals over a semimartingale
$L = (L_s)_{s \le T}$ with bounded deterministic functions $B(s, T)$?

If the functions $B(s, T)$ are sufficiently smooth in $s$, then we can use N. Wiener’s
definition

$$\int_0^t B(s, T)\, dL_s = B(t, T) L_t - \int_0^t \frac{\partial B}{\partial s}(s, T)\, L_s\, ds.$$

(See Chapter III, § 3c on this subject and see [128] in connection with Lévy processes.)


Let

$$\varphi(\lambda) = \lambda b + \frac{\sigma^2}{2} \lambda^2 + \int \left(e^{\lambda x} - 1 - \lambda g(x)\right) \nu(dx) \tag{51}$$

be the cumulant function (see § 3c) of the Lévy process $L = (L_t)_{t \ge 0}$.
We assume that the integral in (51) is well defined and bounded for all $\lambda$ such
that $|\lambda| \le c$, where $c = \sup_{s \le T} |B(s, T)|$.
By the meaning of the cumulant function

$$\mathsf{E} e^{\lambda L_t} = e^{t \varphi(\lambda)}. \tag{52}$$

Let $X_t^T = \int_0^t B(s, T)\, dL_s$, $t \le T$. The process $X^T = (X_t^T)_{t \le T}$ has independent
increments and its triplet $(B^{X^T}, C^{X^T}, \nu^{X^T})$ of predictable characteristics can be
found from the triplet $(B^L, C^L, \nu^L)$ of $L$ (see Chapter IX, § 5a in [128] and [250]).
Using Itô’s formula we can see (see detail in [128]) that

$$\mathsf{E} e^{\lambda X_t^T} = \exp\left(\int_0^t \varphi(\lambda B(s, T))\, ds\right).$$

The process $\left(\exp\left(\lambda X_t^T - \int_0^t \varphi(\lambda B(s, T))\, ds\right)\right)_{t \le T}$ is a martingale (cf. (11) in § 3c).


Hence if we want $(\overline{P}(t, T))_{t \le T}$ to be a martingale, then it seems reasonable to
generalize (50) by setting

$$\overline{P}(t, T) = P(0, T) \exp\left\{\int_0^t B(s, T)\, dL_s - \int_0^t \varphi(B(s, T))\, ds\right\}. \tag{53}$$

Returning from $\overline{P}(t, T)$ to the process $P(t, T)$ we obtain that the $(B, \mathcal{P})$-market
with

$$P(t, T) = P(0, T)\, e^{H(t, T)}, \qquad t \le T, \quad T > 0, \tag{54}$$

where

$$H(t, T) = \int_0^t B(s, T)\, dL_s + \int_0^t \left[r(s) - \varphi(B(s, T))\right] ds, \tag{55}$$

has the following property: the discounted prices $(\overline{P}(t, T))_{t \le T}$ form a martingale
with respect to the original measure $\mathsf{P}$ and there exist no opportunities for arbitrage
in the class of $a$-admissible strategies in this market (cf. Theorem 2).
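
The role of the cumulant function in this construction can be illustrated with the simplest jump process. The sketch below assumes, purely for illustration, that $L$ is a Poisson process with a given intensity, whose cumulant function is $\text{intensity} \cdot (e^{\lambda} - 1)$, and that $B(s, T) \equiv B$ is constant. It checks the exponential-moment identity for $L_t$ and the constancy of the expected discounted price. The function names and parameters are hypothetical.

```python
import math

def cumulant(lam, intensity):
    """Cumulant function of a Poisson process: intensity * (exp(lam) - 1)."""
    return intensity * (math.exp(lam) - 1.0)

def exp_moment(lam, t, intensity, kmax=200):
    """E exp(lam * L_t), computed by summing over the Poisson distribution of L_t."""
    total = 0.0
    prob = math.exp(-intensity * t)          # P(L_t = 0)
    for k in range(kmax + 1):
        total += math.exp(lam * k) * prob
        prob *= intensity * t / (k + 1)      # P(L_t = k + 1) from P(L_t = k)
    return total

def expected_discounted_price(p0, B, t, intensity):
    """E of p0 * exp(B * L_t - t * cumulant(B)) for a constant function B(s, T) = B."""
    return p0 * exp_moment(B, t, intensity) * math.exp(-t * cumulant(B, intensity))
```

The first function pair reproduces the identity $\mathsf{E} e^{\lambda L_t} = e^{t \varphi(\lambda)}$ for this particular $L$, and `expected_discounted_price` returns `p0` for every `t`, which is the constancy of expectation implied by the martingale property of the discounted price.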
Using the connection between $\widehat{H}(t, T)$ and $H(t, T)$ (see (10) in § 3d) we obtain

$$\widehat{H}(t, T) = H(t, T) + \frac{\sigma^2}{2} \int_0^t B^2(s, T)\, ds + \sum_{0 < s \le t} \left(e^{B(s, T) \Delta L_s} - 1 - B(s, T) \Delta L_s\right) \tag{56}$$

and

$$dP(t, T) = P(t-, T)\, d\widehat{H}(t, T). \tag{57}$$


Remark 3. Starting from equations for $P(t, T)$ E. Eberlein and S. Raible [128] have
analyzed the structure of forward rates and interest rates $f(t, T)$ and $r(t)$ and also
considered in detail hyperbolic Lévy processes, i.e., Lévy processes such that the
random variable $L_1$ has hyperbolic distribution (see Chapter III, § 1d).


§5b. Completeness 


1. Proceeding to the issue of completeness in $(B, \mathcal{P})$-models it is worth recalling
that for discrete time $n \le N < \infty$ and finitely many kinds of stock the completeness
of an arbitrage-free market is equivalent (by the Second fundamental theorem) to
the uniqueness of the martingale measure and to the existence of an ‘S-representation’
for martingales (with respect to some martingale measure).

For a diffusion $(B, \mathcal{P})$-market generated by an $m$-dimensional Wiener process
$W = (W^1, \ldots, W^m)$ we have a multidimensional analogue of Theorem 2 in Chapter III,
§ 3c: each local martingale $M = (M_t, \mathcal{F}_t)$ admits a representation

$$M_t = M_0 + \sum_{i=1}^m \int_0^t \psi_i(s)\, dW_s^i \tag{1}$$

with $\mathcal{F}_s$-measurable functions $\psi_i(s)$ such that

$$\int_0^t \psi_i^2(s)\, ds < \infty \quad (\mathsf{P}\text{-a.s.}), \qquad t > 0.$$

As for $(B, S)$-markets (see § 4b), this representation plays a key role in the study
of completeness in $(B, \mathcal{P})$-models.

Let $T_0$ be a fixed instant of time and let $f_{T_0}$ be a $\mathcal{F}_{T_0}$-measurable pay-off
function. We assume that $f_{T_0}$ is bounded ($|f_{T_0}| \le C$) and we shall say that this
pay-off function can be replicated if there exists a self-financing portfolio $\pi = (\beta, \gamma)$
such that

$$X_{T_0}^\pi = f_{T_0} \quad (\mathsf{P}\text{-a.s.}). \tag{2}$$

If this property holds for each $T_0$ and each bounded $\mathcal{F}_{T_0}$-measurable pay-off
function $f_{T_0}$, then we say that our $(B, \mathcal{P})$-model is complete.
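Assume that the prices $P(t, T)$ of $T$-bonds satisfy the relations

$$dP(t, T) = P(t, T)\left(r_t\, dt + \sum_{i=1}^m B_i(t, T)\, dW_t^i\right). \tag{3}$$

We also assume that $0 < B_i(t, T) < C = \mathrm{Const}$.
Then the original measure $\mathsf{P}$ is martingale in the following sense: the family of
prices $(\overline{P}(t, T))_{t \le T}$ is a local martingale and

$$\overline{X}_t^\pi = \overline{X}_0^\pi + \sum_{i=1}^m \int_0^t \left[\int_s^\infty B_i(s, T)\, \overline{P}(s, T)\, \gamma_s(dT)\right] dW_s^i. \tag{4}$$

We set

$$M_t = \mathsf{E}\left(\frac{f_{T_0}}{B_{T_0}(r)} \,\Big|\, \mathcal{F}_t\right), \qquad t \le T_0. \tag{5}$$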


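Then $M = (M_t, \mathcal{F}_t)$ admits the representation (1) and comparing with (4) we
see that to replicate the variables $M_t$, $t \le T_0$, by the value $\overline{X}_t^\pi$, $t \le T_0$, of some
self-financing portfolio $\pi$ it is necessary and sufficient that ([38])

$$\psi_i(t) = \int_t^{T_0} B_i(t, T)\, \overline{P}(t, T)\, \gamma_t(dT) \tag{6}$$

($(d\mathsf{P} \times dt)$-a.s.) for $i = 1, \ldots, m$ and $t \le T_0$.
If there exists a solution $\{\gamma_t^*(dT),\ t \le T_0,\ T \le T_0\}$, then setting

$$\beta_t^* = M_t - \int_t^{T_0} \overline{P}(t, T)\, \gamma_t^*(dT), \tag{7}$$

we see that $\pi^* = (\beta^*, \gamma^*)$ is a self-financing portfolio such that

$$\overline{X}_t^{\pi^*} = M_t, \qquad t \le T_0.$$

In particular, $\overline{X}_{T_0}^{\pi^*} = \frac{f_{T_0}}{B_{T_0}(r)}$ and therefore $X_{T_0}^{\pi^*} = f_{T_0}$ ($\mathsf{P}$-a.s.), i.e., we have
$T_0$-completeness.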




2. EXAMPLE ([36], [38]). Assume that there are finitely many, specifically, $d$ bonds
in a (B, P)-market having maturity times $T_1, \dots, T_d$. (Hence the supports of the
measures $\gamma_t(dT)$ are concentrated at the points $\{T_1\}, \dots, \{T_d\}$.) There are $m$
‘sources of randomness’ and it seems plausible that we need sufficiently many kinds
of bonds to replicate the pay-off function $f_{T_0}$: $d$ must probably be not less than $m$.
Let $d = m$. Then the system (6) takes the following form:

$$\psi_i(t) = \sum_{j=1}^{d} B_i(t,T_j)\, \overline{P}(t,T_j)\, \gamma_t(\{T_j\}), \tag{8}$$

where $i = 1, \dots, d$.

It is clear from (8) that for each $t \le T_0$ this system has a solution if and only if
the matrix $\|B_i(t,T_j)\|$ is invertible.

If $m = d = 1$, then the system (8) turns into the single relation

$$\psi_1(t) = B_1(t,T_1)\, \overline{P}(t,T_1)\, \gamma_t(\{T_1\}), \tag{9}$$

which means that

$$\gamma_t(\{T_1\}) = \frac{\psi_1(t)}{B_1(t,T_1)\, \overline{P}(t,T_1)}$$

for $t \le T_1$ and $\gamma_t(\{T_1\}) = 0$ for $T_1 < t \le T_0$.
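
As a rough numerical illustration of system (8), the sketch below solves for the weights $\gamma_t(\{T_j\})$ at one fixed $t$ by Gaussian elimination. All numbers and the function names (`solve_linear`, `replicating_weights`) are hypothetical and chosen only for illustration; a singular matrix is reported as a failure to replicate.

```python
def solve_linear(A, rhs):
    """Gaussian elimination with partial pivoting; raises ValueError if A is singular."""
    n = len(rhs)
    M = [[float(v) for v in A[i]] + [float(rhs[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda row: abs(M[row][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular: system (8) has no unique solution")
        M[col], M[pivot] = M[pivot], M[col]
        for row in range(col + 1, n):
            factor = M[row][col] / M[col][col]
            for c in range(col, n + 1):
                M[row][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def replicating_weights(psi, B, P_bar):
    """Solve psi[i] = sum_j B[i][j] * P_bar[j] * gamma[j] for gamma (system (8), d = m).

    psi[i]   : integrand psi_i(t) from representation (1)       (assumed given)
    B[i][j]  : volatility coefficient B_i(t, T_j)                (assumed given)
    P_bar[j] : discounted bond price at time t for maturity T_j  (assumed given)
    """
    d = len(psi)
    A = [[B[i][j] * P_bar[j] for j in range(d)] for i in range(d)]
    return solve_linear(A, psi)


# Hypothetical inputs for two bonds and two sources of randomness.
gamma = replicating_weights([0.05, 0.03], [[0.20, 0.35], [0.10, 0.40]], [0.97, 0.90])
print(gamma)
```

With $d = m = 1$ the same function reduces to the quotient in (9).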


§5c. Fundamental Partial Differential Equation 
of the Term Structure of Bonds 


1. By contrast to the direct approach to the description of the dynamics of bond 
prices P(t, T) by stochastic differential equations (see § 5a), taking the indirect ap- 
proach we assume that the prices P(t, T) have the following form: 


P(t, T) = F(t,r(t), T), (1) 


where r(t) is some ‘interest rate’ taking, as a rule, only nonnegative values. 

The indirect approach (1) was historically among the first few. Later on it has 
been overshadowed (first of all, in theoretical studies) by the direct approach. How- 
ever, as regards obtaining simple analytic formulas, the approach (1) has retained 
its importance and is still popular. 


2. It should be noted from the outset that this method works only under the
assumption that the interest rate process $(r(t))_{t \ge 0}$ is a Markov process satisfying
the stochastic differential equation

$$dr(t) = a(t, r(t))\, dt + b(t, r(t))\, dW_t \tag{2}$$

or an equation of ‘diffusion-with-jumps’ kind (see equation (6) in Chapter III, § 4a).
We shall assume that for each $T > 0$ the function $F^T = F(t,r,T)$ is in the class
$C^{1,2}$ (in $t$ and $r$). Then, by Itô's formula,

$$dF^T = \Bigl(\frac{\partial F^T}{\partial t} + a\, \frac{\partial F^T}{\partial r} + \frac{1}{2} b^2\, \frac{\partial^2 F^T}{\partial r^2}\Bigr) dt + b\, \frac{\partial F^T}{\partial r}\, dW_t. \tag{3}$$

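Assuming that $F^T > 0$, we now rewrite this equation as follows (cf. equation (8)
in § 5a):

$$dF^T = F^T \bigl(A^T(t, r(t))\, dt + B^T(t, r(t))\, dW_t\bigr), \tag{4}$$

where

$$A^T(t,r) = \frac{1}{F^T}\Bigl(\frac{\partial F^T}{\partial t} + a\, \frac{\partial F^T}{\partial r} + \frac{1}{2} b^2\, \frac{\partial^2 F^T}{\partial r^2}\Bigr) \tag{5}$$

and

$$B^T(t,r) = \frac{b}{F^T}\, \frac{\partial F^T}{\partial r}. \tag{6}$$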

In finding additional conditions on the functions $F^T$ (besides the obvious condition
$F^T(T, r(T)) = F(T, r(T), T) = P(T,T) = 1$) we shall rely on the condition
that the (B, P)-market in question must be arbitrage-free. Then, comparing (4) and
formula (8) in §5a and taking account of relation (25) in §5a, we see that in order
that the market be arbitrage-free there must exist a function $\varphi(t)$ such that

$$\frac{A^T(t,r) - r}{B^T(t,r)} = -\varphi(t) \tag{7}$$

for all $t$ and $T$, $t \le T$. (For each function $\varphi = \varphi(t)$ we can construct the corresponding
‘martingale’ measure $\widetilde{\mathsf{P}}$ by formula (21) in §5a.)

In view of (5) and (6), we see from (7) that if the functions $F^T = F(t,r,T)$,
$T > 0$, satisfy the fundamental equation

$$\frac{\partial F}{\partial t} + (a + \varphi b)\, \frac{\partial F}{\partial r} + \frac{1}{2} b^2\, \frac{\partial^2 F}{\partial r^2} - rF = 0, \qquad t < T, \tag{8}$$

with boundary condition $F(T,r,T) = 1$, $T > 0$, $r \ge 0$, then a (B, P)-market with
$P(t,T) = F(t, r(t), T)$ is arbitrage-free.

Equation (8) is very similar to the Fundamental equation of hedge pricing for
stock (see (19) in § 4c). However, there is a crucial difference between these cases:
the function $\varphi = \varphi(t)$ in (8) cannot be defined in a unique way on the basis of
our assumptions and must be set a priori. As pointed out above, the martingale
measure $\widetilde{\mathsf{P}}$ is defined in terms of this function. Hence the choice of the latter is
equivalent to a choice of a ‘risk-neutral’ measure operative, in investors’ opinion, in
our (B, P)-market.

3. Following notation (11) of Chapter III, § 3f, where we discussed the forward and
the backward Kolmogorov equations and the probability representation of solutions
to partial differential equations, we set now

$$L(s,r) = \bigl(a(s,r) + \varphi(s)\, b(s,r)\bigr) \frac{\partial}{\partial r} + \frac{1}{2} b^2(s,r)\, \frac{\partial^2}{\partial r^2}. \tag{9}$$

The operator $L(s,r)$ is the backward operator of the diffusion Markov process
$r = (r(t))_{t \ge 0}$ satisfying the stochastic differential equation

$$dr(t) = \bigl(a(t, r(t)) + \varphi(t)\, b(t, r(t))\bigr)\, dt + b(t, r(t))\, dW_t. \tag{10}$$

Rewriting (8) as

$$-\frac{\partial F}{\partial s} = L(s,r) F - rF, \qquad s < T, \tag{11}$$

we observe that this equation belongs (see Chapter III, § 3f) to the class of Feynman-Kac
equations (for the diffusion process $r = (r(t))_{t \ge 0}$).
The probabilistic solution of this equation with boundary condition $F(T,r,T) = 1$
can be represented (cf. (19') in Chapter III, § 3f and see [123], [170], [288] for detail)
as follows:

$$F(s,r,T) = \mathsf{E}_{s,r}\Bigl\{\exp\Bigl(-\int_s^T r(u)\, du\Bigr)\Bigr\}, \tag{12}$$

where $\mathsf{E}_{s,r}$ is the expectation with respect to the probability distribution of the
process $(r(u))_{s \le u \le T}$ such that $r(s) = r$.

Note that formula (12), which we have derived under the assumption of the
absence of arbitrage, is in perfect accord with the earlier obtained representation (5)
in §5a, because in the Markov case we have

$$\mathsf{E}\Bigl\{\exp\Bigl(-\int_t^T r(u)\, du\Bigr) \Bigm| \mathcal{F}_t\Bigr\} = \mathsf{E}\Bigl\{\exp\Bigl(-\int_t^T r(u)\, du\Bigr) \Bigm| r(t)\Bigr\}.$$

4. It is worth noting at this point that all the models of the dynamics of stochastic 
interest rates discussed in Chapter III, § 4a (see (7)-(21)) count among diffusion 
Markov models of type (10). 

Their variety is primarily a result of their authors’ desire to find analytically 
treatable models producing results compatible with actually observable data. 

As noted in Chapter III, § 4c, one important subclass of such analytically treatable
models is formed by the (affine) models ([36], [38], [117], [119]) having the
representation

$$F(t, r(t), T) = \exp\{\alpha(t,T) - r(t)\, \beta(t,T)\} \tag{13}$$

with deterministic functions $\alpha(t,T)$ and $\beta(t,T)$.
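The model that we shall now discuss (borrowed from the above-mentioned
papers) can be obtained as follows.
Assume that in (10) we have

$$a(t,r) + \varphi(t)\, b(t,r) = a_1(t) + r\, a_2(t)$$

and

$$b(t,r) = \sqrt{b_1(t) + r\, b_2(t)}.$$

Then (8) takes the following form:

$$\frac{\partial F}{\partial t} + (a_1 + r a_2)\, \frac{\partial F}{\partial r} + \frac{1}{2}(b_1 + r b_2)\, \frac{\partial^2 F}{\partial r^2} - rF = 0, \qquad t < T. \tag{14}$$

Seeking the solution of this equation in the form (13) with $F(T,r,T) = 1$ we see
that $\alpha(t,T)$ and $\beta(t,T)$ are defined by $a_1(t)$, $a_2(t)$, $b_1(t)$, and $b_2(t)$ in accordance
with the following relations:

$$\frac{\partial \beta}{\partial t} + a_2 \beta - \frac{1}{2} b_2 \beta^2 = -1, \qquad \beta(T,T) = 0 \tag{15}$$

and

$$\frac{\partial \alpha}{\partial t} = a_1 \beta - \frac{1}{2} b_1 \beta^2, \qquad \alpha(T,T) = 0. \tag{16}$$

(Substituting (13) into (14) and dividing by $F$ gives a relation that is linear in $r$;
equating to zero the coefficient of $r$ and the free term yields (15) and (16).)

Relation (15) is the Riccati equation. On finding its solution $\beta(t,T)$ we can find
$\alpha(t,T)$ from (16), which gives us the affine model (13) with these functions $\alpha(t,T)$
and $\beta(t,T)$.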
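
The relations (15) and (16) can also be integrated numerically. The sketch below is only an illustration under simplifying assumptions: the coefficients $a_1, a_2, b_1, b_2$ are taken constant, the system is rewritten in the time to maturity $s = T - t$, and a classical Runge-Kutta step is used. The names `affine_coefficients` and `bond_price` are hypothetical. As a cross-check it uses constant coefficients of Vasiček type ($a_2 = -\bar b$, $b_2 = 0$), for which $\beta$ is known in closed form; the closed form for $\alpha$ used in the check is the standard Vasiček expression obtained by integrating (16), not a formula quoted from this section.

```python
import math


def affine_coefficients(a1, a2, b1, b2, tau, steps=2000):
    """Integrate (15)-(16) with constant coefficients over time to maturity tau.

    With s = T - t the system reads
        d(beta)/ds  = 1 + a2*beta - 0.5*b2*beta**2,   beta(0)  = 0,
        d(alpha)/ds = -a1*beta + 0.5*b1*beta**2,      alpha(0) = 0.
    Returns (alpha, beta) at s = tau.
    """
    def rhs(beta):
        return (1.0 + a2 * beta - 0.5 * b2 * beta * beta,
                -a1 * beta + 0.5 * b1 * beta * beta)

    h = tau / steps
    alpha, beta = 0.0, 0.0
    for _ in range(steps):
        k1 = rhs(beta)
        k2 = rhs(beta + 0.5 * h * k1[0])
        k3 = rhs(beta + 0.5 * h * k2[0])
        k4 = rhs(beta + h * k3[0])
        alpha += h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0
        beta += h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0
    return alpha, beta


def bond_price(r, alpha, beta):
    """Affine representation (13): F = exp(alpha - r*beta)."""
    return math.exp(alpha - r * beta)


# Illustrative constants (not taken from the text).
a_bar, b_bar, c_bar, tau = 0.03, 0.5, 0.02, 5.0
alpha, beta = affine_coefficients(a_bar, -b_bar, c_bar ** 2, 0.0, tau)
beta_exact = (1.0 - math.exp(-b_bar * tau)) / b_bar
alpha_exact = ((beta_exact - tau) * (a_bar / b_bar - c_bar ** 2 / (2 * b_bar ** 2))
               - c_bar ** 2 * beta_exact ** 2 / (4 * b_bar))
print(beta - beta_exact, alpha - alpha_exact, bond_price(0.04, alpha, beta))
```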


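EXAMPLE. We consider the Vasiček model (see (8) in Chapter III, § 4a)

$$dr(t) = (\bar a - \bar b\, r(t))\, dt + \bar c\, dW_t,$$

where $\bar a$, $\bar b$, and $\bar c$ are constants.
Then we see from (15) and (16) that

$$\frac{\partial \beta}{\partial t} - \bar b\, \beta = -1, \qquad \beta(T,T) = 0, \tag{17}$$

and

$$\frac{\partial \alpha}{\partial t} = \bar a\, \beta - \frac{1}{2} \bar c^{\,2} \beta^2, \qquad \alpha(T,T) = 0. \tag{18}$$

Consequently,

$$\beta(t,T) = \frac{1}{\bar b}\bigl(1 - e^{-\bar b (T-t)}\bigr) \tag{19}$$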
and 
1. European Options in Diffusion (B, S)-Stockmarkets


§1a. Bachelier’s Formula 


1. The content of this chapter relates to continuous time, but, conceptually, it is in 
direct connection with our discussion of the discrete-time case in the sixth chapter. 

Here we shall be mostly interested in options. We take them as examples where 
one can clearly see the role of arbitrage theory and stochastic calculus and the op- 
portunities they give one in calculations related to continuous-time financial models. 


2. As mentioned before (Chapter I, §2a) L. Bachelier was by all means the first 
person to describe the dynamics of stock prices using models based on ‘random 
walks and their limit cases’ (see [12]), i.e., Brownian motions in the contemporary 
language. 

Assuming that the fluctuations of stock prices are similar to a Brownian motion, 
Bachelier carried out several calculations for the (rational) prices of some options 
traded in France at his time and compared their results with the actual market 
prices. 

Formula (5) below is an updated version of several of Bachelier’s results on options [12],
which is why we call it Bachelier’s formula.

In the linear Bachelier model one considers a (B, S)-market such that the state
of the bank account $B = (B_t)_{t \le T}$ remains the same ($B_t \equiv 1$), while the share price
$S = (S_t)_{t \le T}$ can be described by a linear Brownian motion with drift:

$$S_t = S_0 + \mu t + \sigma W_t, \qquad t \le T, \tag{1}$$

where $W = (W_t)_{t \ge 0}$ is a standard Wiener process (a Brownian motion) on some
probability space $(\Omega, \mathcal{F}, \mathsf{P})$.

Prices in this model can also take negative values, therefore it cannot adequately
reflect real life. Nevertheless, its discussion is of interest from different points
of view: historically, it is the first diffusion model; moreover, this model is
both arbitrage-free and complete (see Chapter VII).
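We now set

$$Z_T = \exp\Bigl(-\frac{\mu}{\sigma} W_T - \frac{1}{2}\Bigl(\frac{\mu}{\sigma}\Bigr)^2 T\Bigr) \tag{2}$$

and let $\mathcal{F}_t$, $t \le T$, be the $\sigma$-algebra generated by the values of the Wiener process
$\{W_s,\ s \le t\}$ and completed by the addition of the sets of $\mathsf{P}$-probability zero.
We define the new measure $\widetilde{\mathsf{P}}_T$ on $(\Omega, \mathcal{F}_T)$ by setting

$$d\widetilde{\mathsf{P}}_T = Z_T\, d\mathsf{P}_T \tag{3}$$

(cf. (8) in Chapter VII, § 4a), where $\mathsf{P}_T = \mathsf{P}\,|\,\mathcal{F}_T$.
Note that $\widetilde{\mathsf{P}}_T$ is the unique martingale measure in this model (see Chapter VII,
§ 4a.5), i.e., the only measure such that $\widetilde{\mathsf{P}}_T \sim \mathsf{P}_T$ and the process $S = (S_t)_{t \le T}$ is a
$\widetilde{\mathsf{P}}_T$-martingale. Moreover, by Girsanov’s theorem (Chapter III, § 3e or Chapter VII,
§ 3b),

$$\mathrm{Law}(S_0 + \mu t + \sigma W_t;\ t \le T \mid \widetilde{\mathsf{P}}_T) = \mathrm{Law}(S_0 + \sigma W_t;\ t \le T \mid \mathsf{P}_T). \tag{4}$$
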
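THEOREM (Bachelier’s formula). The rational price $C_T = C(f_T; \mathsf{P})$ of the standard
European call option with pay-off function $f_T = (S_T - K)^+$ in the model (1) is
defined by the formula

$$C_T = (S_0 - K)\, \Phi\Bigl(\frac{S_0 - K}{\sigma\sqrt{T}}\Bigr) + \sigma\sqrt{T}\, \varphi\Bigl(\frac{S_0 - K}{\sigma\sqrt{T}}\Bigr), \tag{5}$$

where

$$\varphi(x) = \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2}, \qquad \Phi(x) = \int_{-\infty}^{x} \varphi(y)\, dy.$$

In particular, for $S_0 = K$ we have

$$C_T = \sigma \sqrt{\frac{T}{2\pi}}. \tag{6}$$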
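
Formula (5) is straightforward to evaluate. The following is a minimal sketch, assuming only the Python standard library; the function names are illustrative. Note that in the Bachelier model $\sigma$ is an absolute (price-unit) volatility, so the sample value below is hypothetical.

```python
import math


def norm_pdf(x):
    """Standard normal density, written varphi in the text."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def norm_cdf(x):
    """Standard normal distribution function Phi, via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bachelier_call(s0, k, sigma, t):
    """Rational price (5) of a European call in the linear Bachelier model (B_t = 1)."""
    b = sigma * math.sqrt(t)
    d = (s0 - k) / b
    return (s0 - k) * norm_cdf(d) + b * norm_pdf(d)


# At the money, (5) reduces to (6): sigma * sqrt(T / (2*pi)).
print(bachelier_call(100.0, 100.0, 20.0, 1.0), 20.0 * math.sqrt(1.0 / (2.0 * math.pi)))
```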


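Proof. Analyzing the proofs of the theorems in Chapter VII, §§ 4a, b we can see that the
results obtained for the model (5) (with positive prices) in § 4a remain valid for
the present model (1). (The ‘key’ relation (14) in § 4b assumes now the form

$$\gamma_t = \frac{1}{\sigma Z_t}\Bigl(\psi_t + \frac{\mu}{\sigma} M_t\Bigr)$$

with $Z$ as in (2).) Thus, the (B, S)-market now is arbitrage-free,
$T$-complete, and the rational price is

$$C_T = \mathsf{E}(Z_T f_T) = \mathsf{E}_{\widetilde{\mathsf{P}}_T}(f_T). \tag{7}$$

By (4) and the self-similarity of Wiener processes,

$$\mathsf{E}_{\widetilde{\mathsf{P}}_T}(S_T - K)^+ = \mathsf{E}_{\widetilde{\mathsf{P}}_T}(S_0 + \mu T + \sigma W_T - K)^+ = \mathsf{E}_{\mathsf{P}_T}(S_0 - K + \sigma W_T)^+ = \mathsf{E}(S_0 - K + \sigma\sqrt{T}\, W_1)^+. \tag{8}$$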
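Note that if $\xi$ is a random variable with standard normal distribution $\mathcal{N}(0,1)$,
then

$$\mathsf{E}(a + b\xi)^+ = \int_{-a/b}^{\infty} (a + bx)\, \varphi(x)\, dx = a\, \Phi\Bigl(\frac{a}{b}\Bigr) + b \int_{-a/b}^{\infty} x\, \varphi(x)\, dx = a\, \Phi\Bigl(\frac{a}{b}\Bigr) - b \int_{-a/b}^{\infty} d\varphi(x) = a\, \Phi\Bigl(\frac{a}{b}\Bigr) + b\, \varphi\Bigl(\frac{a}{b}\Bigr) \tag{9}$$

for $a \in \mathbb{R}$ and $b > 0$.
Setting

$$a = S_0 - K, \qquad b = \sigma\sqrt{T},$$

we obtain the required formula (5) from (7), (8), and (9).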
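
Identity (9) can be sanity-checked by direct numerical integration. The sketch below is an illustration only: it approximates the integral by a composite Simpson rule, truncating the upper limit at a point where the Gaussian tail is negligible (an assumption of the sketch), and compares the result with the closed form.

```python
import math


def norm_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def expected_positive_part(a, b, n=20000):
    """Simpson approximation of E(a + b*xi)^+ = int_{-a/b}^{inf} (a + b*x) phi(x) dx, b > 0."""
    lo = -a / b
    hi = max(lo, 0.0) + 12.0  # truncation point; the tail beyond it is negligible
    h = (hi - lo) / n         # n must be even for Simpson's rule
    total = 0.0
    for i in range(n + 1):
        x = lo + i * h
        weight = 1 if i in (0, n) else (4 if i % 2 else 2)
        total += weight * (a + b * x) * norm_pdf(x)
    return total * h / 3.0


def closed_form(a, b):
    """Right-hand side of (9)."""
    return a * norm_cdf(a / b) + b * norm_pdf(a / b)


for a, b in [(1.5, 2.0), (-1.0, 0.5), (0.0, 1.0)]:
    print(a, b, expected_positive_part(a, b), closed_form(a, b))
```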


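3. Now let $\pi = (\beta, \gamma)$ be a strategy (in the class of self-financing portfolios) of
initial value $X_0^{\pi} = C_T$ that replicates the pay-off function $f_T$, i.e., let $X_T^{\pi} = f_T$
($\mathsf{P}$-a.s.).
By Chapter VII, § 4b the value $X^{\pi} = (X_t^{\pi})_{t \le T}$ of this strategy satisfies the
relation

$$X_t^{\pi} = \mathsf{E}_{\widetilde{\mathsf{P}}_T}(f_T \mid \mathcal{F}_t). \tag{10}$$

Since $f_T = (S_T - K)^+$ and $S = (S_t)_{t \le T}$ is a Markov process, it follows that

$$X_t^{\pi} = \mathsf{E}_{\widetilde{\mathsf{P}}_T}\bigl((S_T - K)^+ \mid \mathcal{F}_t\bigr) = \mathsf{E}_{\widetilde{\mathsf{P}}_T}\bigl(((S_t - K) + (S_T - S_t))^+ \mid S_t\bigr) = \mathsf{E}(a + b\xi)^+ = a\, \Phi\Bigl(\frac{a}{b}\Bigr) + b\, \varphi\Bigl(\frac{a}{b}\Bigr), \tag{11}$$

where $a = S_t - K$ and $b = \sigma\sqrt{T - t}$.
For $0 \le t < T$ and $S > 0$ we set

$$C(t,S) = (S - K)\, \Phi\Bigl(\frac{S - K}{\sigma\sqrt{T-t}}\Bigr) + \sigma\sqrt{T-t}\, \varphi\Bigl(\frac{S - K}{\sigma\sqrt{T-t}}\Bigr). \tag{12}$$

Then we see from (11) that $X_t^{\pi} = C(t, S_t)$. Simultaneously,

$$dX_t^{\pi} = \gamma_t\, dS_t. \tag{13}$$

By Itô’s formula for $C(t, S_t)$ we obtain

$$dC(t, S_t) = \frac{\partial C}{\partial S}\, dS_t + \Bigl(\frac{\partial C}{\partial t} + \frac{1}{2}\sigma^2\, \frac{\partial^2 C}{\partial S^2}\Bigr) dt. \tag{14}$$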
Comparing (14) and (13) and applying Corollary 1 to the Doob-Meyer decomposition
(Chapter III, § 5b) we conclude that

$$\gamma_t = \frac{\partial C}{\partial S}(t, S_t). \tag{15}$$

Differentiating the right-hand side of (12), after simple transformations we obtain

$$\gamma_t = \Phi\Bigl(\frac{S_t - K}{\sigma\sqrt{T-t}}\Bigr). \tag{16}$$

The suitable value of $\beta_t$ can be found from the observation that

$$C(t, S_t) = \beta_t + \gamma_t S_t. \tag{17}$$

In other words,

$$\beta_t = C(t, S_t) - \gamma_t S_t. \tag{18}$$

In view of (12) and (16) we see that

$$\beta_t = -K\, \Phi\Bigl(\frac{S_t - K}{\sigma\sqrt{T-t}}\Bigr) + \sigma\sqrt{T-t}\, \varphi\Bigl(\frac{S_t - K}{\sigma\sqrt{T-t}}\Bigr). \tag{19}$$

The following peculiarities of the behavior of $\gamma_t$ and $\beta_t$ as $t \uparrow T$ are worth noting.
Assume that close to the terminal instant $T$ the stock prices $S_t$ are higher
than $K$. Then we see from (16) and (19) that

$$\gamma_t \to 1 \quad \text{and} \quad \beta_t \to -K \tag{20}$$

as $t \uparrow T$. On the other hand, if $S_t < K$, then

$$\gamma_t \to 0 \quad \text{and} \quad \beta_t \to 0 \tag{21}$$

as $t \uparrow T$. Both relations are quite natural. For if $S_t < K$ close
to time $T$, then the pay-off function $f_T$ vanishes and, clearly, the capital $X_T^{\pi}$ equal
to zero at time $T$ is sufficient for the option writer, which is just the case if (21)
holds.

On the other hand, if $S_t > K$ close to time $T$, then $f_T = S_T - K$, and the seller
needs the capital $X_T^{\pi} = S_T - K$. Since $X_t^{\pi} = \beta_t + \gamma_t S_t$, the required amount will
be available if (20) holds, since then $X_t^{\pi} = \beta_t + \gamma_t S_t \to S_T - K$.
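
The limiting behavior in (20) and (21) is easy to observe numerically. The sketch below evaluates the hedge (16), (19) of the Bachelier model shortly before maturity; the parameter values and the function name `bachelier_hedge` are hypothetical.

```python
import math


def norm_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bachelier_hedge(t, s, k, sigma, T):
    """Return (gamma_t, beta_t) from (16) and (19); the bank account is B_t = 1."""
    b = sigma * math.sqrt(T - t)
    d = (s - k) / b
    gamma = norm_cdf(d)
    beta = -k * norm_cdf(d) + b * norm_pdf(d)
    return gamma, beta


K, SIGMA, T = 100.0, 20.0, 1.0
for s in (110.0, 90.0):
    gamma, beta = bachelier_hedge(T - 1e-8, s, K, SIGMA, T)
    # Portfolio value beta + gamma*s approaches the pay-off (s - K)^+ as t -> T.
    print(s, gamma, beta, beta + gamma * s)
```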
§ 1b. Black-Scholes Formula. Martingale Inference


1. As already mentioned, the main deficiency of the linear Bachelier model

$$S_t = S_0 + \mu t + \sigma W_t \tag{1}$$

is that the prices $S_t$ can assume negative values.

A more realistic model is that of a geometric (economic as some say, see [420])
Brownian motion, in which the prices can be expressed by the formula

$$S_t = S_0 e^{H_t}, \tag{2}$$

where

$$H_t = \Bigl(\mu - \frac{\sigma^2}{2}\Bigr) t + \sigma W_t. \tag{3}$$

In other words,

$$S_t = S_0\, e^{(\mu - \frac{\sigma^2}{2}) t + \sigma W_t}. \tag{4}$$

Using Itô’s formula (Chapter III, § 3d) we see that

$$dS_t = S_t(\mu\, dt + \sigma\, dW_t).$$

This is often expressed symbolically as

$$\frac{dS_t}{S_t} = \mu\, dt + \sigma\, dW_t,$$

which emphasizes the analogy with the formula

$$\frac{\Delta S_n}{S_{n-1}} = \rho_n,$$

which we used above (e.g., in the Cox-Ross-Rubinstein model in the discrete-time
case; see Chapter II, § 1e).

The model of a geometric Brownian motion (2) was suggested by P. Samuelson [420]
in 1965; it underlies the Black-Merton-Scholes model and the famous
Black-Scholes formula for the rational price of a standard European call option
with pay-off function $f_T = (S_T - K)^+$ discovered by F. Black and M. Scholes [44]
and R. Merton [346] in 1973.


2. Thus, we shall consider the Black-Merton-Scholes (B, S)-model and assume
that the bank account $B = (B_t)_{t \ge 0}$ evolves in accordance with the formula

$$dB_t = r B_t\, dt, \tag{5}$$

whereas the stock prices $S = (S_t)_{t \ge 0}$ are governed by a geometric Brownian motion:

$$dS_t = S_t(\mu\, dt + \sigma\, dW_t). \tag{6}$$

Thus, let

$$B_t = B_0 e^{rt}, \tag{7}$$

$$S_t = S_0\, e^{(\mu - \frac{\sigma^2}{2}) t + \sigma W_t}. \tag{8}$$
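THEOREM (Black-Scholes formula). The rational price $C_T = C(f_T; \mathsf{P})$ of a standard
European call option with pay-off function $f_T = (S_T - K)^+$ in the model
(5)-(6) is described by the formula

$$C_T = S_0\, \Phi\Bigl(\frac{\ln\frac{S_0}{K} + T\bigl(r + \frac{\sigma^2}{2}\bigr)}{\sigma\sqrt{T}}\Bigr) - K e^{-rT}\, \Phi\Bigl(\frac{\ln\frac{S_0}{K} + T\bigl(r - \frac{\sigma^2}{2}\bigr)}{\sigma\sqrt{T}}\Bigr). \tag{9}$$

In particular, for $S_0 = K$ and $r = 0$ we have

$$C_T = S_0\Bigl[\Phi\Bigl(\frac{\sigma\sqrt{T}}{2}\Bigr) - \Phi\Bigl(-\frac{\sigma\sqrt{T}}{2}\Bigr)\Bigr] \tag{10}$$

and $C_T \sim K\sigma\sqrt{\dfrac{T}{2\pi}}$ as $T \to 0$ (cf. formula (6) in § 1a).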

We present the proof of this formula, as borrowed from [44] and [346], in the 
next section. Here we suggest what one would call a ‘martingale’ proof; it is based 
upon our discussion in Chapter VII. 
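
Before turning to the proof, here is a minimal sketch of how formula (9) can be evaluated, assuming only the Python standard library (the function names are illustrative). It also checks the special case (10) and the small-$T$ asymptotics stated in the theorem on hypothetical inputs.

```python
import math


def norm_cdf(x):
    """Standard normal distribution function Phi."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def black_scholes_call(s0, k, r, sigma, t):
    """Rational price (9) of a standard European call."""
    b = sigma * math.sqrt(t)
    y_plus = (math.log(s0 / k) + t * (r + 0.5 * sigma ** 2)) / b
    y_minus = y_plus - b
    return s0 * norm_cdf(y_plus) - k * math.exp(-r * t) * norm_cdf(y_minus)


print(black_scholes_call(100.0, 100.0, 0.05, 0.2, 1.0))

# Special case (10): S0 = K, r = 0.
s0, sigma, t = 100.0, 0.2, 1.0
x = 0.5 * sigma * math.sqrt(t)
print(black_scholes_call(s0, s0, 0.0, sigma, t), s0 * (norm_cdf(x) - norm_cdf(-x)))

# Small-T behavior: the ratio to K*sigma*sqrt(T/(2*pi)) is close to 1.
t_small = 1e-4
print(black_scholes_call(s0, s0, 0.0, sigma, t_small)
      / (s0 * sigma * math.sqrt(t_small / (2.0 * math.pi))))
```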

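Using notation similar to that in the preceding section we set

$$Z_T = \exp\Bigl(-\frac{\mu - r}{\sigma} W_T - \frac{1}{2}\Bigl(\frac{\mu - r}{\sigma}\Bigr)^2 T\Bigr), \tag{11}$$

and let $\widetilde{\mathsf{P}}_T$ be a measure on $(\Omega, \mathcal{F}_T)$ such that $d\widetilde{\mathsf{P}}_T = Z_T\, d\mathsf{P}_T$.
By Girsanov’s theorem (Chapter VII, § 3b) the process $\widetilde{W} = (\widetilde{W}_t)_{t \le T}$ with
$\widetilde{W}_t = W_t + \dfrac{\mu - r}{\sigma}\, t$ is Wiener with respect to $\widetilde{\mathsf{P}}_T$, therefore

$$\mathrm{Law}(\mu t + \sigma W_t;\ t \le T \mid \widetilde{\mathsf{P}}_T) = \mathrm{Law}(rt + \sigma \widetilde{W}_t;\ t \le T \mid \widetilde{\mathsf{P}}_T) = \mathrm{Law}(rt + \sigma W_t;\ t \le T \mid \mathsf{P}_T).$$

Hence

$$\mathrm{Law}(S_t;\ t \le T \mid \widetilde{\mathsf{P}}_T) = \mathrm{Law}\bigl(S_0 e^{(\mu - \frac{\sigma^2}{2}) t + \sigma W_t};\ t \le T \mid \widetilde{\mathsf{P}}_T\bigr) = \mathrm{Law}\bigl(S_0 e^{(r - \frac{\sigma^2}{2}) t + \sigma W_t};\ t \le T \mid \mathsf{P}_T\bigr). \tag{12}$$

From the theorem in Chapter VII, § 4a we see that, taking the class of 0-admissible
strategies $\pi = (\beta, \gamma)$ with $\int_0^T \gamma_u^2 S_u^2\, du < \infty$ ($\mathsf{P}$-a.s.), we can describe the
rational price $C_T = C(f_T; \mathsf{P})$ by the following formula:

$$C_T = B_0\, \mathsf{E}_{\widetilde{\mathsf{P}}_T} \frac{f_T}{B_T}. \tag{13}$$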
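Since $f_T = (S_T - K)^+$ in this case, it follows in view of (12) and the self-similarity
of the Wiener process ($\mathrm{Law}(W_T) = \mathrm{Law}(\sqrt{T}\, W_1)$) that

$$
\begin{aligned}
C_T = B_0\, \mathsf{E}_{\widetilde{\mathsf{P}}_T} \frac{f_T}{B_T}
&= e^{-rT}\, \mathsf{E}_{\widetilde{\mathsf{P}}_T}(S_T - K)^+ \\
&= e^{-rT}\, \mathsf{E}_{\widetilde{\mathsf{P}}_T}\bigl(S_0 e^{(\mu - \frac{\sigma^2}{2})T + \sigma W_T} - K\bigr)^+ \\
&= e^{-rT}\, \mathsf{E}_{\mathsf{P}_T}\bigl(S_0 e^{(r - \frac{\sigma^2}{2})T + \sigma W_T} - K\bigr)^+ \\
&= e^{-rT}\, \mathsf{E}_{\mathsf{P}_T}\bigl(S_0 e^{(r - \frac{\sigma^2}{2})T + \sigma\sqrt{T}\, W_1} - K\bigr)^+ \\
&= e^{-rT}\, \mathsf{E}_{\mathsf{P}_T}\bigl(S_0 e^{rT} e^{\sigma\sqrt{T}\, W_1 - \frac{\sigma^2}{2}T} - K\bigr)^+ \\
&= e^{-rT}\, \mathsf{E}\bigl(a e^{b\xi - \frac{b^2}{2}} - K\bigr)^+,
\end{aligned} \tag{14}
$$

where

$$a = S_0 e^{rT}, \qquad b = \sigma\sqrt{T}, \qquad \xi \sim \mathcal{N}(0,1). \tag{15}$$

It is an easy calculation (similar to (9) in § 1a) that

$$\mathsf{E}\bigl(a e^{b\xi - \frac{b^2}{2}} - K\bigr)^+ = a\, \Phi\Bigl(\frac{\ln\frac{a}{K} + \frac{1}{2}b^2}{b}\Bigr) - K\, \Phi\Bigl(\frac{\ln\frac{a}{K} - \frac{1}{2}b^2}{b}\Bigr). \tag{16}$$

Thus, it follows from (14)-(16) that

$$C_T = e^{-rT}\Bigl[a\, \Phi\Bigl(\frac{\ln\frac{a}{K} + \frac{1}{2}b^2}{b}\Bigr) - K\, \Phi\Bigl(\frac{\ln\frac{a}{K} - \frac{1}{2}b^2}{b}\Bigr)\Bigr].$$

Setting here $a = S_0 e^{rT}$ and $b = \sigma\sqrt{T}$ we arrive at the Black-Scholes formula (9),
which completes the proof.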


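Remark 1. Setting

$$y_{\pm} = \frac{\ln\frac{S_0}{K} + T\bigl(r \pm \frac{\sigma^2}{2}\bigr)}{\sigma\sqrt{T}},$$

we can write (9) in a more compact form:

$$C_T = S_0\, \Phi(y_+) - K e^{-rT}\, \Phi(y_-). \tag{17}$$

Let $P_T$ be the rational price of the standard European put option with pay-off
function $f_T = (K - S_T)^+$. Then, since

$$P_T = C_T - S_0 + K e^{-rT}$$

(cf. ‘call-put parity’ identity (9) in Chapter VI, § 4d), it follows that

$$P_T = -S_0\Bigl[1 - \Phi\Bigl(\frac{\ln\frac{S_0}{K} + T\bigl(r + \frac{\sigma^2}{2}\bigr)}{\sigma\sqrt{T}}\Bigr)\Bigr] + K e^{-rT}\Bigl[1 - \Phi\Bigl(\frac{\ln\frac{S_0}{K} + T\bigl(r - \frac{\sigma^2}{2}\bigr)}{\sigma\sqrt{T}}\Bigr)\Bigr] \tag{18}$$

or

$$P_T = -S_0\, \Phi(-y_+) + K e^{-rT}\, \Phi(-y_-). \tag{19}$$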
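
As a quick check of Remark 1, the sketch below evaluates the call by (17) and the put by (19) and confirms the call-put parity numerically. It is an illustration on hypothetical inputs, with illustrative function names.

```python
import math


def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def y_plus_minus(s0, k, r, sigma, t):
    """The quantities y_+ and y_- of Remark 1."""
    b = sigma * math.sqrt(t)
    y_plus = (math.log(s0 / k) + t * (r + 0.5 * sigma ** 2)) / b
    return y_plus, y_plus - b


def call_price(s0, k, r, sigma, t):
    """Compact form (17)."""
    y_plus, y_minus = y_plus_minus(s0, k, r, sigma, t)
    return s0 * norm_cdf(y_plus) - k * math.exp(-r * t) * norm_cdf(y_minus)


def put_price(s0, k, r, sigma, t):
    """Compact form (19)."""
    y_plus, y_minus = y_plus_minus(s0, k, r, sigma, t)
    return -s0 * norm_cdf(-y_plus) + k * math.exp(-r * t) * norm_cdf(-y_minus)


s0, k, r, sigma, t = 100.0, 95.0, 0.03, 0.25, 0.5
c, p = call_price(s0, k, r, sigma, t), put_price(s0, k, r, sigma, t)
# Parity: P_T = C_T - S_0 + K*exp(-r*T); the difference below should be ~0.
print(c, p, p - (c - s0 + k * math.exp(-r * t)))
```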
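3. The model in question is $T$-complete (see Definition 1 in Chapter VII, § 2d),
and there exists a 0-admissible strategy $\pi = (\beta, \gamma)$ of value $X^{\pi} = (X_t^{\pi})_{t \le T}$ such
that $X_0^{\pi} = C_T$ and $X_T^{\pi}$ replicates $f_T$ faithfully:

$$X_T^{\pi} = f_T \quad (\mathsf{P}\text{-a.s.}).$$

By the theorem in Chapter VII, § 4b,

$$
\begin{aligned}
X_t^{\pi} = B_t\, \mathsf{E}_{\widetilde{\mathsf{P}}_T}\Bigl(\frac{f_T}{B_T} \Bigm| \mathcal{F}_t\Bigr)
&= e^{-r(T-t)}\, \mathsf{E}_{\widetilde{\mathsf{P}}_T}\bigl((S_T - K)^+ \mid \mathcal{F}_t\bigr) \\
&= e^{-r(T-t)}\, \mathsf{E}_{\widetilde{\mathsf{P}}_T}\Bigl(\Bigl(S_t\, \frac{S_T}{S_t} - K\Bigr)^+ \Bigm| S_t\Bigr) \\
&= e^{-r(T-t)}\, \mathsf{E}_{\widetilde{\mathsf{P}}_T}\bigl((S_t e^{(\mu - \frac{\sigma^2}{2})(T-t) + \sigma(W_T - W_t)} - K)^+ \mid S_t\bigr) \\
&= e^{-r(T-t)}\, \mathsf{E}_{\mathsf{P}_T}\bigl((S_t e^{(r - \frac{\sigma^2}{2})(T-t) + \sigma(W_T - W_t)} - K)^+ \mid S_t\bigr) \\
&= e^{-r(T-t)}\, \mathsf{E}\bigl((S_t e^{r(T-t)} e^{\sigma\sqrt{T-t}\, \xi - \frac{\sigma^2}{2}(T-t)} - K)^+ \mid S_t\bigr) \\
&= e^{-r(T-t)}\, \mathsf{E}\bigl((a e^{b\xi - \frac{b^2}{2}} - K)^+ \mid S_t\bigr),
\end{aligned} \tag{20}
$$

where

$$a = S_t e^{r(T-t)}, \qquad b = \sigma\sqrt{T-t}, \qquad \xi \sim \mathcal{N}(0,1),$$

and the variables $S_t$ and $\xi = (W_T - W_t)/\sqrt{T-t}$ are independent with respect to the initial
measure $\mathsf{P}$.
Taking (16) into account, we see from (20) that the price $C(t, S_t) = X_t^{\pi}$ has the
following expression:

$$C(t, S_t) = S_t\, \Phi\Bigl(\frac{\ln\frac{S_t}{K} + (T-t)\bigl(r + \frac{\sigma^2}{2}\bigr)}{\sigma\sqrt{T-t}}\Bigr) - K e^{-r(T-t)}\, \Phi\Bigl(\frac{\ln\frac{S_t}{K} + (T-t)\bigl(r - \frac{\sigma^2}{2}\bigr)}{\sigma\sqrt{T-t}}\Bigr). \tag{21}$$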
As in § 1a (see subsection 3) we can show that, for an optimal hedging portfolio
$\pi = (\beta_t, \gamma_t)_{t \le T}$, we have

$$\gamma_t = \frac{\partial C}{\partial S}(t, S_t). \tag{22}$$

By (21), after simple transformations we obtain

$$\gamma_t = \Phi\Bigl(\frac{\ln\frac{S_t}{K} + (T-t)\bigl(r + \frac{\sigma^2}{2}\bigr)}{\sigma\sqrt{T-t}}\Bigr) \tag{23}$$

(cf. formula (16) in § 1a), and since $\beta_t B_t + \gamma_t S_t = C(t, S_t)$, it follows that

$$\beta_t = -\frac{K}{B_0}\, e^{-rT}\, \Phi\Bigl(\frac{\ln\frac{S_t}{K} + (T-t)\bigl(r - \frac{\sigma^2}{2}\bigr)}{\sigma\sqrt{T-t}}\Bigr). \tag{24}$$

It is easy to see that $0 < \gamma_t < 1$ and $\beta_t$ is always negative, which indicates
borrowing from the bank account under the constraint $-\dfrac{K}{B_0} < \beta_t$.

As with the Bachelier model, we have properties (20) and (21) in § 1a; namely,
if $t \uparrow T$ and $S_t > K$ close to time $T$ then

$$\gamma_t S_t \to S_T \quad \text{and} \quad \beta_t B_t \to -K;$$

and
if $t \uparrow T$ and $S_t < K$ close to time $T$ then

$$\gamma_t S_t \to 0 \quad \text{and} \quad \beta_t B_t \to 0.$$

Remark 2. The above price $C(t, S_t)$ depends, of course, also on the parameters $r$
and $\sigma$ specifying the particular model. To indicate this dependence we shall write
$C = C(t, s, r, \sigma)$ (with $S_t = s$).

It is often important in practice to have a knowledge of the ‘sensitivity’ of
$C(t, s, r, \sigma)$ to variations of the parameters $t$, $s$, $r$, and $\sigma$. The following functions
are standard measures of this ‘sensitivity’ (see, e.g., [36] and [415]):

$$\theta = \frac{\partial C}{\partial t}, \qquad \Delta = \frac{\partial C}{\partial s}, \qquad \rho = \frac{\partial C}{\partial r}, \qquad V = \frac{\partial C}{\partial \sigma}.$$

(Here ‘V’ is pronounced ‘vega’.)

For the Black-Scholes model, from (21) we see that

$$\theta = -\frac{s\sigma\, \varphi(y_+(T-t))}{2\sqrt{T-t}} - rK e^{-r(T-t)}\, \Phi(y_-(T-t)),$$

$$\Delta = \Phi(y_+(T-t)),$$

$$\rho = K(T-t)\, e^{-r(T-t)}\, \Phi(y_-(T-t)),$$

$$V = s\, \varphi(y_+(T-t))\, \sqrt{T-t},$$

where

$$\varphi(x) = \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2}$$

and

$$y_{\pm}(T-t) = \frac{\ln\frac{s}{K} + (T-t)\bigl(r \pm \frac{\sigma^2}{2}\bigr)}{\sigma\sqrt{T-t}}.$$

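4. Our calculations of $C_T$ in (14)-(16) could be carried out in a somewhat different
way, on the basis of an appropriate choice of a discounting process (‘numéraire’),
as discussed in Chapter VII, § 1b.

To this end we rewrite (13) with $f_T = (S_T - K)^+$ as follows:

$$
\begin{aligned}
C_T = B_0\, \mathsf{E}_{\widetilde{\mathsf{P}}_T} \frac{(S_T - K)^+}{B_T}
&= B_0\, \mathsf{E}_{\widetilde{\mathsf{P}}_T} \frac{S_T - K}{B_T}\, I(S_T > K) \\
&= B_0\, \mathsf{E}_{\widetilde{\mathsf{P}}_T} \frac{S_T}{B_T}\, I(S_T > K) - K e^{-rT}\, \mathsf{E}_{\widetilde{\mathsf{P}}_T} I(S_T > K).
\end{aligned} \tag{25}
$$

By (12), the calculation of $\mathsf{E}_{\widetilde{\mathsf{P}}_T} I(S_T > K)$ encounters no complications:

$$\mathsf{E}_{\widetilde{\mathsf{P}}_T} I(S_T > K) = \Phi\Bigl(\frac{\ln\frac{S_0}{K} + T\bigl(r - \frac{\sigma^2}{2}\bigr)}{\sigma\sqrt{T}}\Bigr). \tag{26}$$

To calculate $B_0\, \mathsf{E}_{\widetilde{\mathsf{P}}_T} \dfrac{S_T}{B_T}\, I(S_T > K)$ we consider the process $\overline{Z} = (\overline{Z}_t)_{t \le T}$ with

$$\overline{Z}_t = \frac{S_t / S_0}{B_t / B_0}. \tag{27}$$

It is important that $\overline{Z}$ is a positive martingale (with respect to the ‘martingale’
measure $\widetilde{\mathsf{P}}_T$) with $\mathsf{E}_{\widetilde{\mathsf{P}}_T} \overline{Z}_T = 1$. Hence we can introduce another measure, $\widehat{\mathsf{P}}_T$, by
setting

$$d\widehat{\mathsf{P}}_T = \overline{Z}_T\, d\widetilde{\mathsf{P}}_T. \tag{28}$$

(The measure $\widehat{\mathsf{P}}_T$ is called in [434] the martingale measure dual to $\widetilde{\mathsf{P}}_T$.)
By (7) and (8),

$$\overline{Z}_t = e^{\sigma W_t + (\mu - r - \frac{\sigma^2}{2}) t} = e^{\sigma \widetilde{W}_t - \frac{\sigma^2}{2} t},$$

where $\widetilde{W}_t = W_t + \dfrac{\mu - r}{\sigma}\, t$ ($t \le T$) is a Wiener process with respect to $\widetilde{\mathsf{P}}_T$.

Using Girsanov’s theorem (Chapter III, § 3e or Chapter VII, § 3b) it is easy to
verify that

$$\widehat{W}_t = \widetilde{W}_t - \sigma t \quad \Bigl(= W_t + \Bigl(\frac{\mu - r}{\sigma} - \sigma\Bigr) t\Bigr), \qquad t \le T, \tag{29}$$

is a Wiener process with respect to $\widehat{\mathsf{P}}_T$. In view of the above notation,

$$B_0\, \mathsf{E}_{\widetilde{\mathsf{P}}_T} \frac{S_T}{B_T}\, I(S_T > K) = S_0\, \mathsf{E}_{\widetilde{\mathsf{P}}_T} \overline{Z}_T\, I(S_T > K) = S_0\, \mathsf{E}_{\widehat{\mathsf{P}}_T} I(S_T > K),$$

therefore it follows from (25) that

$$C_T = S_0\, \mathsf{E}_{\widehat{\mathsf{P}}_T} I(S_T > K) - K e^{-rT}\, \mathsf{E}_{\widetilde{\mathsf{P}}_T} I(S_T > K). \tag{30}$$

By analogy with (12),

$$\mathrm{Law}(S_t;\ t \le T \mid \widehat{\mathsf{P}}_T) = \mathrm{Law}\bigl(S_0 e^{(r + \frac{\sigma^2}{2}) t + \sigma W_t};\ t \le T \mid \mathsf{P}_T\bigr). \tag{31}$$

In particular, if $\xi \sim \mathcal{N}(0,1)$, then

$$\mathrm{Law}(S_T \mid \widehat{\mathsf{P}}_T) = \mathrm{Law}\bigl(S_0 e^{(r + \frac{\sigma^2}{2}) T} e^{\sigma\sqrt{T}\, \xi} \mid \mathsf{P}_T\bigr).$$

Hence

$$\mathsf{E}_{\widehat{\mathsf{P}}_T} I(S_T > K) = \Phi\Bigl(\frac{\ln\frac{S_0}{K} + T\bigl(r + \frac{\sigma^2}{2}\bigr)}{\sigma\sqrt{T}}\Bigr). \tag{32}$$

A combination of (30), (26), and (32) proves the Black-Scholes formula (9) for
$C_T$ in a different way, as promised at the beginning of the subsection.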
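
The change of numéraire can be illustrated numerically: the discounted expectation of $S_T\, I(S_T > K)$ under the martingale measure should coincide with $S_0$ times the probability (32) under the dual measure. The sketch below approximates the former by a composite Simpson rule over the standard normal variable driving $S_T$; the truncation of the upper limit and all names are assumptions of the illustration.

```python
import math


def norm_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def discounted_stock_term(s0, k, r, sigma, T, n=20000):
    """Simpson approximation of B_0 * E[(S_T / B_T) * 1{S_T > K}] under the martingale measure.

    Under that measure exp(-r*T) * S_T = s0 * exp(b*xi - b*b/2) with b = sigma*sqrt(T),
    and {S_T > K} = {xi > -y_minus}.
    """
    b = sigma * math.sqrt(T)
    y_minus = (math.log(s0 / k) + T * (r - 0.5 * sigma ** 2)) / b
    lo = -y_minus
    hi = max(lo, b) + 12.0  # truncation point; the integrand is negligible beyond it
    h = (hi - lo) / n       # n must be even
    total = 0.0
    for i in range(n + 1):
        x = lo + i * h
        weight = 1 if i in (0, n) else (4 if i % 2 else 2)
        total += weight * s0 * math.exp(b * x - 0.5 * b * b) * norm_pdf(x)
    return total * h / 3.0


def dual_measure_term(s0, k, r, sigma, T):
    """S_0 times the probability (32) of {S_T > K} under the dual measure."""
    b = sigma * math.sqrt(T)
    return s0 * norm_cdf((math.log(s0 / k) + T * (r + 0.5 * sigma ** 2)) / b)


args = (100.0, 110.0, 0.04, 0.3, 2.0)
print(discounted_stock_term(*args), dual_measure_term(*args))
```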


§1c. Black-Scholes Formula. 
Inference Based on the Solution of the Fundamental Equation 


1. We now present the original proof of the Black-Scholes formula for the rational 
price of option contracts, suggested independently by F. Black and M. Scholes 
in [44] and R. Merton in [346] (1973). 

Of course, the first question before the authors was about the definition of the 
rational price. Their (remarkable in simplicity and efficiency) idea was that this 
must be just the minimum level of capital allowing the option writer to build a 
hedging portfolio. 

More formally, this can be explained as follows. 

Consider a European option contract with maturity date $T$ and pay-off function $f_T$.
Then the rational (fair) price $Y_t$ of this contract at a time $t$, $0 \le t \le T$,
is (by the definition of F. Black, M. Scholes, and R. Merton) the price of a perfect
European hedge

$$C_{[t,T]} = \inf\{x : \exists\, \pi \text{ with } X_t^{\pi} = x \text{ and } X_T^{\pi} = f_T \ (\mathsf{P}\text{-a.s.})\}. \tag{1}$$

(Cf. the appropriate definitions in Chapter VI, § 1b and Chapter VII, § 4b; we also
denoted before $C_{[0,T]}$ by $C_T$.)

In general, it is not a priori clear whether there exist perfect hedges.

The results of Chapter VII, §§ 4a, b show that such hedges do exist in our
model of a (B, S)-market and, moreover, $Y_t = C_{[t,T]}$ is equal to the quantity
$B_t\, \mathsf{E}_{\widetilde{\mathsf{P}}_T}\Bigl(\dfrac{f_T}{B_T} \Bigm| \mathcal{F}_t\Bigr)$, where $\widetilde{\mathsf{P}}_T$ is the martingale measure. This is why we called
our scheme in § 1b a ‘martingale’ inference.

Papers [44], [346] were written before the development of the ‘martingale’ approach
and used another method for the calculation of $Y_t = C_{[t,T]}$, which we describe now.

Since both the process $S = (S_t)_{t \ge 0}$ and the pay-off function $f_T = (S_T - K)^+$ are
Markov, it is natural to assume that the $\mathcal{F}_t$-measurable variable $Y_t$ depends on the
‘past’ only through $S_t$:

$$Y_t = Y(t, S_t).$$

Assuming that the function $Y = Y(t, S)$ on $[0, T) \times (0, \infty)$ is, in addition, sufficiently
smooth (more precisely, $Y \in C^{1,2}$), the authors of [44] and [346] obtain the
following fundamental equation:

$$\frac{\partial Y}{\partial t} + rS\, \frac{\partial Y}{\partial S} + \frac{1}{2}\sigma^2 S^2\, \frac{\partial^2 Y}{\partial S^2} = rY \tag{2}$$

with boundary condition

$$Y(T, S) = (S - K)^+. \tag{3}$$

(One can find a derivation of (2) in Chapter VII, § 4c; see equation (9) there.)

The next step to the Black-Scholes formula (that is, to the expression for
$Y(0, S_0)$) is to find a solution of (2)-(3).

Equation (2) is of Feynman-Kac kind (see (19) in Chapter III, § 3f) and it can
be solved using the standard techniques developed for such equations.

We consider the new variables

$$\theta = \sigma^2 (T - t), \tag{4}$$

$$Z = \ln S + \Bigl(r - \frac{\sigma^2}{2}\Bigr)(T - t) \tag{5}$$

and set

$$V(\theta, Z) = e^{r(T-t)}\, Y(t, S). \tag{6}$$




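In these variables (2)-(3) is equivalent to the following problem:

$$\frac{\partial V}{\partial \theta} = \frac{1}{2}\, \frac{\partial^2 V}{\partial Z^2}, \tag{7}$$

$$V(0, Z) = (e^{Z} - K)^+. \tag{8}$$

Relation (7) is the heat equation, and by formula (17') in Chapter III, § 3f the
solution to (7)-(8) can be expressed as follows:

$$V(\theta, Z) = \mathsf{E}\bigl(e^{Z + W_\theta} - K\bigr)^+, \tag{9}$$

where $W = (W_\theta)$ is a standard Wiener process.
We set

$$a = e^{Z + \frac{\theta}{2}}, \qquad b = \sqrt{\theta}, \qquad \text{and} \qquad \xi \sim \mathcal{N}(0,1).$$

Then

$$\mathsf{E}\bigl(e^{W_\theta + Z} - K\bigr)^+ = \mathsf{E}\bigl(e^{Z + \frac{\theta}{2}} e^{W_\theta - \frac{\theta}{2}} - K\bigr)^+ = \mathsf{E}\bigl(e^{Z + \frac{\theta}{2}} e^{\sqrt{\theta}\, W_1 - \frac{\theta}{2}} - K\bigr)^+ = \mathsf{E}\bigl(a e^{b\xi - \frac{b^2}{2}} - K\bigr)^+. \tag{10}$$

Using formula (16) in § 1b we see that

$$\mathsf{E}\bigl(e^{W_\theta + Z} - K\bigr)^+ = e^{Z + \frac{\theta}{2}}\, \Phi\Bigl(\frac{Z - \ln K + \theta}{\sqrt{\theta}}\Bigr) - K\, \Phi\Bigl(\frac{Z - \ln K}{\sqrt{\theta}}\Bigr). \tag{11}$$

Finally, using notation (4) and (5), by (6) and (11) we obtain the formula

$$Y(t, S) = e^{-r(T-t)}\, V(\theta, Z) = S\, \Phi\Bigl(\frac{\ln\frac{S}{K} + (T-t)\bigl(r + \frac{\sigma^2}{2}\bigr)}{\sigma\sqrt{T-t}}\Bigr) - K e^{-r(T-t)}\, \Phi\Bigl(\frac{\ln\frac{S}{K} + (T-t)\bigl(r - \frac{\sigma^2}{2}\bigr)}{\sigma\sqrt{T-t}}\Bigr). \tag{12}$$

(Cf. (21) in § 1b.)
Setting here $t = 0$ and $S = S_0$, we obtain the required Black-Scholes formula
(formula (9) in § 1b). As shown in § 1b, the portfolio $\pi = (\beta_t, \gamma_t)_{t \le T}$ with
$\gamma_t = \dfrac{\partial Y}{\partial S}(t, S_t)$ and $\beta_t = \dfrac{1}{B_t}\bigl(Y(t, S_t) - \gamma_t S_t\bigr)$ is a hedge of value $X^{\pi}$ replicating faithfully the
pay-off function $f_T = (S_T - K)^+$.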
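
The chain of substitutions (4)-(6) and the solution (11) of the heat equation can be traced in code. The sketch below builds $Y(t, S)$ from $V(\theta, Z)$ and then estimates, by central finite differences, the residual of the fundamental equation (2); a residual close to zero is what one expects. Step sizes, parameter values and function names are assumptions of the illustration.

```python
import math


def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def heat_solution(theta, z, k):
    """V(theta, Z) given by (11), for theta > 0."""
    root = math.sqrt(theta)
    return (math.exp(z + 0.5 * theta) * norm_cdf((z - math.log(k) + theta) / root)
            - k * norm_cdf((z - math.log(k)) / root))


def price_via_heat(t, s, k, r, sigma, T):
    """Y(t, S) = exp(-r*(T-t)) * V(theta, Z) with theta, Z from (4)-(5)."""
    theta = sigma ** 2 * (T - t)
    z = math.log(s) + (r - 0.5 * sigma ** 2) * (T - t)
    return math.exp(-r * (T - t)) * heat_solution(theta, z, k)


def pde_residual(t, s, k, r, sigma, T, ht=1e-5, hs=1e-2):
    """Finite-difference estimate of Y_t + r*S*Y_S + 0.5*sigma^2*S^2*Y_SS - r*Y."""
    def y(tt, ss):
        return price_via_heat(tt, ss, k, r, sigma, T)

    y_t = (y(t + ht, s) - y(t - ht, s)) / (2.0 * ht)
    y_s = (y(t, s + hs) - y(t, s - hs)) / (2.0 * hs)
    y_ss = (y(t, s + hs) - 2.0 * y(t, s) + y(t, s - hs)) / hs ** 2
    return y_t + r * s * y_s + 0.5 * sigma ** 2 * s ** 2 * y_ss - r * y(t, s)


print(price_via_heat(0.0, 100.0, 100.0, 0.05, 0.2, 1.0))
print(pde_residual(0.3, 105.0, 100.0, 0.05, 0.2, 1.0))
```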
2. In conclusion we make several observations concerning the above two derivations
of the Black-Scholes formula.

The ‘martingale’ inference in § 1b is based on the existence of a unique martingale
measure in the model of a (B, S)-market in question. This means the absence
of arbitrage and enables one to calculate the rational price $C_T$ by the formula
$C_T = B_0\, \mathsf{E}_{\widetilde{\mathsf{P}}_T} \dfrac{f_T}{B_T}$, which (for $f_T = (S_T - K)^+$) gives one just the Black-Scholes
formula.

The approach based on the solution of the ‘fundamental equation’ brings one to
the same formula. It is worth noting that the lack of arbitrage and the existence
of perfect hedges are reflected there by the fact that, due to the unique solubility
of (2)-(3), the resulting price $Y(0, S_0)$ is automatically ‘arbitrage-free’, ‘fair’: if the
price asked for the option is lower than $Y(0, S_0)$ then the seller cannot in general
fulfill his obligations, while if it is higher than $Y(0, S_0)$ then the seller will for sure
cash a net profit (‘have a free lunch’). See Chapter V, § 1b for detail.


§1d. Black-Scholes Formula. Case with Dividends 


1. We assume again that a (B, S)-market can be described by relations (5) and (6)
in § 1b, but the stock also brings dividends (cf. Chapter V, § 1a.6).

More precisely, this means the following. If $S = (S_t)_{t \ge 0}$ is the stock market price
then with dividends taken into account the capital $\overline{S} = (\overline{S}_t)_{t \ge 0}$ of the stockholder
is assumed to evolve (after discounting) by the formula

$$d\Bigl(\frac{\overline{S}_t}{B_t}\Bigr) = d\Bigl(\frac{S_t}{B_t}\Bigr) + \frac{\delta S_t\, dt}{B_t}. \tag{1}$$

Here $\delta \ge 0$ is a parameter characterizing the rate of dividend payments. If
$B \equiv 1$, then it follows from (1) that

$$d\overline{S}_t = dS_t + \delta S_t\, dt, \tag{2}$$

so that the increase over time $dt$ in the capital of the stockholder is the sum of the
increase $dS_t$ in its market price and the dividends $\delta S_t\, dt$ proportional to $S_t$.
Since $dS_t = S_t(\mu\, dt + \sigma\, dW_t)$ and

$$d\Bigl(\frac{S_t}{B_t}\Bigr) = \frac{S_t}{B_t}\bigl((\mu - r)\, dt + \sigma\, dW_t\bigr), \tag{3}$$

it follows by (1) that

$$d\Bigl(\frac{\overline{S}_t}{B_t}\Bigr) = \frac{S_t}{B_t}\bigl((\mu - r + \delta)\, dt + \sigma\, dW_t\bigr). \tag{4}$$
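We set

$$\overline{W}_t = W_t + \frac{\mu - r + \delta}{\sigma}\, t \tag{5}$$

and

$$\overline{Z}_T = \exp\Bigl(-\frac{\mu - r + \delta}{\sigma} W_T - \frac{1}{2}\Bigl(\frac{\mu - r + \delta}{\sigma}\Bigr)^2 T\Bigr). \tag{6}$$

Then, defining the measure $\overline{\mathsf{P}}_T$ by the formula

$$d\overline{\mathsf{P}}_T = \overline{Z}_T\, d\mathsf{P}_T,$$

we see by Girsanov’s theorem (see Chapter III, § 3e) that $\overline{W} = (\overline{W}_t)_{t \le T}$ is a Wiener
process with respect to $\overline{\mathsf{P}}_T$. Hence

$$\mathrm{Law}(\mu t + \sigma W_t;\ t \le T \mid \overline{\mathsf{P}}_T) = \mathrm{Law}((r - \delta) t + \sigma \overline{W}_t;\ t \le T \mid \overline{\mathsf{P}}_T) = \mathrm{Law}((r - \delta) t + \sigma W_t;\ t \le T \mid \mathsf{P}_T)$$

and

$$\mathrm{Law}(S_t;\ t \le T \mid \overline{\mathsf{P}}_T) = \mathrm{Law}\bigl(S_0 e^{(r - \delta - \frac{\sigma^2}{2}) t + \sigma W_t};\ t \le T \mid \mathsf{P}_T\bigr). \tag{7}$$

Let $X_t^{\pi} = \beta_t B_t + \gamma_t S_t$, $t \le T$, be the value of a self-financing strategy $\pi = (\beta, \gamma)$.
Since the discounted capital $\Bigl(\dfrac{X_t^{\pi}}{B_t}\Bigr)_{t \le T}$ is a martingale with respect to $\overline{\mathsf{P}}_T$
if $\pi$ belongs to the class of 0-admissible strategies with $\int_0^T \gamma_u^2 S_u^2\, du < \infty$ ($\mathsf{P}$-a.s.), it
follows that

$$\mathsf{E}_{\overline{\mathsf{P}}_T} \frac{X_T^{\pi}}{B_T} = \frac{X_0^{\pi}}{B_0}.$$

Hence (cf. (13) in § 1b) we obtain that the rational price $C_T(\delta; r)$ of a call option
is expressed by the formula

$$C_T(\delta; r) = B_0\, \mathsf{E}_{\overline{\mathsf{P}}_T} \frac{f_T}{B_T}, \tag{8}$$

where $f_T = (S_T - K)^+$.
In view of (7) and formula (16) in § 1b, we see from (8) that

$$
\begin{aligned}
C_T(\delta; r) &= e^{-rT}\, \mathsf{E}\bigl(S_0 e^{(r - \delta - \frac{\sigma^2}{2}) T + \sigma W_T} - K\bigr)^+ \\
&= e^{-rT}\, \mathsf{E}\bigl(a e^{b\xi - \frac{b^2}{2}} - K\bigr)^+ \qquad (a = S_0 e^{(r - \delta) T},\ b = \sigma\sqrt{T},\ \xi \sim \mathcal{N}(0,1)) \\
&= e^{-\delta T} S_0\, \Phi\Bigl(\frac{\ln\frac{S_0}{K} + T\bigl(r - \delta + \frac{\sigma^2}{2}\bigr)}{\sigma\sqrt{T}}\Bigr) - K e^{-rT}\, \Phi\Bigl(\frac{\ln\frac{S_0}{K} + T\bigl(r - \delta - \frac{\sigma^2}{2}\bigr)}{\sigma\sqrt{T}}\Bigr).
\end{aligned} \tag{9}
$$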
Now let $P_T(\delta; r)$ be the corresponding price of a put option in the case of dividend
payments. It is easy to see that $C_T(\delta; r)$ and $P_T(\delta; r)$ are connected by the
following identity of ‘call-put parity’ (cf. (9) in Chapter VI, § 4d):

$$P_T(\delta; r) = C_T(\delta; r) - S_0 e^{-\delta T} + K e^{-rT}. \tag{10}$$

Comparing (9) and formula (9) in § 1b for $C_T(0; r)$ ($= C_T$) and taking (10) into
account we arrive at the following conclusion.

THEOREM. The rational prices $C_T(\delta; r)$ and $P_T(\delta; r)$ of call and put options in the
case of dividend payments are described by the formulas

$$C_T(\delta; r) = e^{-\delta T}\, C_T(0; r - \delta) \tag{11}$$

and

$$P_T(\delta; r) = e^{-\delta T}\, P_T(0; r - \delta), \tag{12}$$

where $C_T(0; r - \delta)$ and $P_T(0; r - \delta)$ can be defined by the right-hand sides of
formulas (9) and (18) in § 1b (‘the case without dividends’) with $r$ replaced by $r - \delta$.
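
The reduction to the dividend-free case is convenient in computations. The sketch below (illustrative names, hypothetical inputs) implements (11) and (12) on top of the dividend-free formulas and checks them against the parity identity (10) and against a direct evaluation of the call price with dividends.

```python
import math


def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_call(s0, k, r, sigma, T):
    """Dividend-free call, formula (9) in § 1b."""
    b = sigma * math.sqrt(T)
    y_plus = (math.log(s0 / k) + T * (r + 0.5 * sigma ** 2)) / b
    return s0 * norm_cdf(y_plus) - k * math.exp(-r * T) * norm_cdf(y_plus - b)


def bs_put(s0, k, r, sigma, T):
    """Dividend-free put, formula (19) in § 1b."""
    b = sigma * math.sqrt(T)
    y_plus = (math.log(s0 / k) + T * (r + 0.5 * sigma ** 2)) / b
    return -s0 * norm_cdf(-y_plus) + k * math.exp(-r * T) * norm_cdf(-(y_plus - b))


def call_with_dividends(s0, k, r, sigma, T, delta):
    """Formula (11): exp(-delta*T) * C_T(0; r - delta)."""
    return math.exp(-delta * T) * bs_call(s0, k, r - delta, sigma, T)


def put_with_dividends(s0, k, r, sigma, T, delta):
    """Formula (12): exp(-delta*T) * P_T(0; r - delta)."""
    return math.exp(-delta * T) * bs_put(s0, k, r - delta, sigma, T)


def call_with_dividends_direct(s0, k, r, sigma, T, delta):
    """Direct evaluation of the call price with dividend rate delta."""
    b = sigma * math.sqrt(T)
    y_plus = (math.log(s0 / k) + T * (r - delta + 0.5 * sigma ** 2)) / b
    return math.exp(-delta * T) * s0 * norm_cdf(y_plus) - k * math.exp(-r * T) * norm_cdf(y_plus - b)


args = (100.0, 105.0, 0.05, 0.25, 1.5, 0.02)
print(call_with_dividends(*args), call_with_dividends_direct(*args), put_with_dividends(*args))
```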


2. American Options in Diffusion (B, S)-Stockmarkets. 
Case of an Infinite Time Horizon 


§ 2a. Standard Call Option 


1. In the considerations of options and other derivative financial instruments one
must sharply distinguish between two cases: the time parameter $t$ ranging over
a finite interval $[0, T]$, and $t$ ranging over the infinite interval $[0, \infty)$. Of course, the second
case smacks of idealization, but it is much easier to study than the first case, when
the decisions taken at time $t$ depend significantly on the time $T - t$ remaining till
the expiration of the contract.

This explains why we start with the discussion of the second case. We consider 
finite intervals [0, T] in § 3. 
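2. We assume that we have a standard Wiener process $W = (W_t)_{t \ge 0}$ on a filtered
probability space $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t \ge 0}, \mathsf{P})$ and that our diffusion (B, S)-market has the
following structure:

$$dB_t = r B_t\, dt, \qquad B_0 > 0, \tag{1}$$

$$dS_t = S_t(\mu\, dt + \sigma\, dW_t), \qquad S_0 > 0. \tag{2}$$

For a standard discounted call option the pay-off function has, by definition, the
following structure:

$$f_t = e^{-\lambda t} g(S_t), \tag{3}$$

where $g(x) = (x - K)^+$, $x \in E = (0, \infty)$, $\lambda \ge 0$.
By analogy with the discrete-time case we set

$$V^*(x) = \sup_{\tau \in \mathfrak{M}_0^{\infty}} B_0\, \widetilde{\mathsf{E}}_x \frac{f_\tau}{B_\tau}, \tag{4}$$

where the supremum is taken over the class of finite stopping times

$$\mathfrak{M}_0^{\infty} = \{\tau = \tau(\omega) : 0 \le \tau(\omega) < \infty,\ \omega \in \Omega\}, \tag{5}$$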
and $\widetilde{\mathsf{E}}_x$ is the expectation with respect to the martingale measure $\widetilde{\mathsf{P}}_x$ such that the
process $S = (S_t)_{t \ge 0}$ has a stochastic differential

$$dS_t = S_t(r\, dt + \sigma\, dW_t), \qquad S_0 = x, \tag{6}$$

with respect to this measure.
To simplify the notation we assume from the very beginning that $\mu = r$. Making
this assumption we can drop the sign ‘~’ in our notation for $\widetilde{\mathsf{P}}_x$ and $\widetilde{\mathsf{E}}_x$.
Thus, let

$$V^*(x) = \sup_{\tau \in \mathfrak{M}_0^{\infty}} \mathsf{E}_x\, e^{-(\lambda + r)\tau} (S_\tau - K)^+. \tag{7}$$

It is reasonable in many problems to consider, alongside $\mathfrak{M}_0^{\infty}$, also the class

$$\overline{\mathfrak{M}}_0^{\infty} = \{\tau = \tau(\omega) : 0 \le \tau(\omega) \le \infty,\ \omega \in \Omega\}$$

of Markov times that can also assume the value $+\infty$ and to set

$$\overline{V}^*(x) = \sup_{\tau \in \overline{\mathfrak{M}}_0^{\infty}} \mathsf{E}_x\, e^{-(\lambda + r)\tau} (S_\tau - K)^+ I(\tau < \infty). \tag{8}$$

Finding $V^*(x)$ and $\overline{V}^*(x)$ is in direct relation to the calculations for standard
American call options because the values of $V^*(x)$ and $\overline{V}^*(x)$ are just the rational
prices, provided that the buyer can choose the exercise time in the class $\mathfrak{M}_0^{\infty}$ or $\overline{\mathfrak{M}}_0^{\infty}$
and $S_0 = x$. (The case of $\tau = \infty$ corresponds to declining to exercise the option.)
The proof of this assertion in the discrete-time case can be carried out in the same
way as the proof of Theorem 1 in Chapter VI, § 2c. The changes in the continuous-time
case are not very essential: see, for instance, [33], [265], or [281] for greater
detail. Moreover, if $\tau^*$ and $\overline{\tau}^*$ are optimal times delivering the solutions to (7)
and (8), respectively, then they are also the optimal exercise times (in the classes
$\mathfrak{M}_0^{\infty}$ and $\overline{\mathfrak{M}}_0^{\infty}$).
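3. Embarking on the discussion of optimal stopping problems (7) and (8) we single
out the (noninteresting) case of $\lambda = 0$ first.

In that case

$$e^{-rt}(S_t - K)^+ = \bigl(S_0 e^{\sigma W_t - \frac{\sigma^2}{2} t} - K e^{-rt}\bigr)^+,$$

which shows that the process $(e^{-rt}(S_t - K)^+)_{t \ge 0}$ is a submartingale and therefore
if $\tau \in \mathfrak{M}_0^{T}$, i.e., $\tau(\omega) \le T$ for $\omega \in \Omega$, then

$$\mathsf{E}_x\, e^{-r\tau}(S_\tau - K)^+ \le \mathsf{E}_x\, e^{-rT}(S_T - K)^+ \le x. \tag{9}$$

By the Black-Scholes formula (see (9) in § 1b),

$$\mathsf{E}_x\, e^{-rT}(S_T - K)^+ \to x \quad \text{as } T \to \infty, \tag{10}$$

for each $r > 0$.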
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Since V*(z) = jim V7 (x) in our case, where 
0° 
Vi(c) = sup Ege~At7(5, — K)+ (11) 
TEMT 


(see [441; Chapter 3] and cf. Chapter VI, § 5b), it follows by (9) and (10) that if 
à = 0 and r > 0, then ‘the observations must be continued as long as possible’. 
More precisely, for each z > 0 and each € > 0 there exists a deterministic instant 
Tr e such that 

Ese "Tee (Sp, -K)+ >a-e 


and Tr, + oo ase > 0. 
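
The limit (10) can be checked numerically. The following is a minimal sketch, assuming the standard Black-Scholes closed form for $\mathsf E_x e^{-rT}(S_T - K)^+$; the function names `norm_cdf` and `bs_call` and the parameter values are illustrative choices, not taken from the text.

```python
import math


def norm_cdf(z):
    """Standard normal distribution function Phi(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def bs_call(x, K, r, sigma, T):
    """E_x e^{-rT} (S_T - K)^+ for dS = S (r dt + sigma dW), S_0 = x."""
    d1 = (math.log(x / K) + (r + 0.5 * sigma**2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    return x * norm_cdf(d1) - K * math.exp(-r * T) * norm_cdf(d2)


# Illustrative parameters (assumptions for this sketch only).
x, K, r, sigma = 1.0, 1.0, 0.05, 0.3
prices = [bs_call(x, K, r, sigma, T) for T in (1, 10, 100, 1000)]
for T, p in zip((1, 10, 100, 1000), prices):
    print(T, p)  # the values increase towards x as T grows
```

The printed values grow with T and approach x from below, in line with (9) and (10).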


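4. We formulate now the main results obtained for the optimal stopping prob-
lems (7) and (8) in the case $\lambda > 0$.


THEOREM. If $\lambda > 0$, then for each $x \in (0, \infty)$ we have

$$V^*(x) = \overline V^*(x) = \begin{cases} c^* x^{\gamma_1}, & x < x^*, \\ x - K, & x \ge x^*, \end{cases} \tag{12}$$

where

$$\gamma_1 = \Bigl(\frac12 - \frac{r}{\sigma^2}\Bigr) + \sqrt{\Bigl(\frac12 - \frac{r}{\sigma^2}\Bigr)^2 + \frac{2(\lambda + r)}{\sigma^2}}, \tag{13}$$

$$c^* = \gamma_1^{-\gamma_1} \Bigl(\frac{\gamma_1 - 1}{K}\Bigr)^{\gamma_1 - 1}, \tag{14}$$

$$x^* = K\,\frac{\gamma_1}{\gamma_1 - 1}. \tag{15}$$

There exists an optimal stopping time in the class $\overline{\mathfrak{M}}_0^\infty$, namely, the time

$$\tau^* = \inf\{t \ge 0\colon S_t \ge x^*\}. \tag{16}$$

Moreover,

$$\mathsf P_x(\tau^* < \infty) = \begin{cases} 1 & \text{if } r \ge \dfrac{\sigma^2}{2} \text{ or } x \ge x^*, \\[2mm] \Bigl(\dfrac{x}{x^*}\Bigr)^{1 - \frac{2r}{\sigma^2}} & \text{if } r < \dfrac{\sigma^2}{2} \text{ and } x < x^*. \end{cases} \tag{17}$$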
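
Formulas (12)-(15) translate directly into a few lines of code. The sketch below is illustrative only: the names `gamma1` and `perpetual_call` and the numerical parameters are assumptions of this example.

```python
import math


def gamma1(r, sigma, lam):
    """Positive root (13) of the quadratic equation for gamma."""
    a = 0.5 - r / sigma**2
    return a + math.sqrt(a * a + 2.0 * (lam + r) / sigma**2)


def perpetual_call(x, K, r, sigma, lam):
    """Return (V*(x), x*, c*) from (12), (15), (14); requires lam > 0."""
    if lam <= 0.0:
        raise ValueError("the theorem assumes lambda > 0")
    g1 = gamma1(r, sigma, lam)                      # g1 > 1 because lam > 0
    x_star = K * g1 / (g1 - 1.0)                    # threshold (15)
    c_star = g1 ** (-g1) * ((g1 - 1.0) / K) ** (g1 - 1.0)  # constant (14)
    value = c_star * x**g1 if x < x_star else x - K        # price (12)
    return value, x_star, c_star


# Illustrative parameters (assumptions for this sketch only).
value, x_star, c_star = perpetual_call(1.0, 1.0, 0.05, 0.3, 0.05)
print(value, x_star, c_star)
```

For $x \ge x^*$ the function returns the intrinsic value $x - K$, which corresponds to immediate exercise.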


We present two proofs of these results below. The first is based on the ‘Mar- 
kovian’ approach to optimal stopping problems and is conceptually similar to the 
proof in the discrete-time case (see Chapter VI, §5b). The second is based on 
some ‘martingale’ ideas used in [32] and on the transition to the ‘dual’ probability 
measure (see § 1b.4). 
5. The first proof. We consider optimal stopping problems that are slightly more
general than (7) and (8).

Let

$$V^*(x) = \sup_{\tau \in \mathfrak{M}_0^\infty} \mathsf E_x e^{-\beta\tau} g(S_\tau), \tag{18}$$

$$\overline V^*(x) = \sup_{\tau \in \overline{\mathfrak{M}}_0^\infty} \mathsf E_x e^{-\beta\tau} g(S_\tau) I(\tau < \infty) \tag{19}$$

be the prices in the optimal stopping problem for the Markov process $S = (S_t, \mathscr F_t, \mathsf P_x)$,
$x \in E = (0, \infty)$, where $\mathsf P_x$ is the probability distribution for the process S with
$S_0 = x$, $\beta > 0$, and g = g(x) is some Borel function.

If g = g(x) is nonnegative and continuous, then the general theory of optimal
stopping rules for Markov processes (see [441; Chapter 3] and cf. Theorem 4 in
Chapter VI, § 2a) says that:

(a) $$V^*(x) = \overline V^*(x), \qquad x \in E; \tag{20}$$

(b) $V^*(x)$ is the smallest $\beta$-excessive majorant of g(x), i.e., the smallest function
V(x) such that

$$V(x) \ge g(x) \quad\text{and}\quad V(x) \ge e^{-\beta t}\, T_t V(x), \tag{21}$$

where $T_t V(x) = \mathsf E_x V(S_t)$;

(c) $$V^*(x) = \lim_{n} \lim_{N} Q_n^N g(x), \tag{22}$$

where

$$Q_n g(x) = \max\bigl(g(x),\ e^{-\beta 2^{-n}}\, T_{2^{-n}} g(x)\bigr); \tag{23}$$


(d) if Ex |sup eFtg(Si)| < œ, then for each € > 0 the instant 
t 


Te = inf {t: V* (S+) < e F4g(S4) +e} (24) 


is an ¢-optimal stopping time in the class MB, i.e., Pe(Te < œ) = 1, x € E, and 
V*(x) —e < Ere 9(S;.); 
(e) if 
To = inf {t: V*(St) < e gS) 


is an optimal stopping time (Pz(7 < co) = 1, x € E), then it is optimal in the 
class MG: 
V*(x)= Eze 9 g(Sm), zE E; 


moreover, if 7; is another optimal stopping time, then Pg(To < 71) = 1, x € E, ie., 
To is the smallest optimal stopping time. 

Let $C^* = \{x \in E\colon V^*(x) > g(x)\}$ and let $D^* = \{x \in E\colon V^*(x) = g(x)\}$.

It is easy to see from (22) and (23) (cf. Chapter VI, § 5b) that the structure of
$V^* = V^*(x)$ is rather simple: this is a downward convex function on $E = (0, \infty)$
majorizing g = g(x). In addition, there exists $x^*$ such that $C^* = \{x\colon x < x^*\}$ and
$D^* = \{x\colon x \ge x^*\}$.

Hence the solution of the problems (7) and (8) is reduced to finding $x^*$ and, of
course, the function $V^*(x)$ ($= \overline V^*(x)$).

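Analyzing the arguments used in the corresponding discrete-time problem in
Chapter VI, § 5b.6, one easily understands that the required value of $x^*$ and the
function $V^*(x)$, the smallest $(\lambda + r)$-excessive majorant of g(x), must be solutions
of the following Stefan, or free-boundary, problem (see [441; 3.8]):

$$LV(x) = (\lambda + r)V(x), \qquad x < \widetilde x, \tag{25}$$

$$V(x) = g(x), \qquad x \ge \widetilde x, \tag{26}$$

$$\frac{dV(x)}{dx}\bigg|_{x \uparrow \widetilde x} = \frac{dg(x)}{dx}\bigg|_{x \downarrow \widetilde x}, \tag{27}$$

where

$$L = rx\,\frac{\partial}{\partial x} + \frac{\sigma^2}{2}\,x^2\,\frac{\partial^2}{\partial x^2} \tag{28}$$

is the infinitesimal operator (see [126]) of the process $S = (S_t)_{t \ge 0}$ with stochastic
differential

$$dS_t = S_t(r\,dt + \sigma\,dW_t).$$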


We shall seek a solution of (25) (in the so far unknown domain $(0, \widetilde x)$) in the
following form:

$$V(x) = c\,x^{\gamma}. \tag{29}$$

Then we obtain the following equation for $\gamma$:

$$\gamma^2 - \Bigl(1 - \frac{2r}{\sigma^2}\Bigr)\gamma - \frac{2(\lambda + r)}{\sigma^2} = 0. \tag{30}$$

To simplify the notation we assume that $\sigma^2 = 1$. (If $\sigma^2 \ne 1$, then one must
make the change $r \to r/\sigma^2$, $\lambda \to \lambda/\sigma^2$ in the answers.)

Equation (30) with $\sigma^2 = 1$ has two roots,

$$\gamma_1 = \Bigl(\frac12 - r\Bigr) + \sqrt{\Bigl(\frac12 - r\Bigr)^2 + 2(\lambda + r)} \tag{31}$$

and

$$\gamma_2 = \Bigl(\frac12 - r\Bigr) - \sqrt{\Bigl(\frac12 - r\Bigr)^2 + 2(\lambda + r)}. \tag{32}$$

Since $\lambda > 0$, it follows that $\gamma_1 > 1$. (If $\lambda = 0$, then $\gamma_1 = 1$.) The root $\gamma_2$ is negative.
Hence the general solution of (25) is as follows:

$$V(x) = c_1 x^{\gamma_1} + c_2 x^{\gamma_2}. \tag{33}$$

As in the discrete-time case (Chapter VI, § 5b), we see from (33) that $c_2 = 0$
since otherwise $V(x) \to \infty$ as $x \downarrow 0$, which is impossible in our context (we must
have $V^*(x) \ge 0$ and $V^*(x) \le x$).

Thus, $V(x) = c_1 x^{\gamma_1}$ for $x < \widetilde x$, where $c_1$ and the 'free' boundary $\widetilde x$ are the
unknowns. To find them we use condition (26) and condition (27) of 'smooth
pasting'.

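Condition (26) shows that

$$c_1 \widetilde x^{\,\gamma_1} = \widetilde x - K. \tag{34}$$

Condition (27) takes the form

$$c_1 \gamma_1 \widetilde x^{\,\gamma_1 - 1} = 1. \tag{35}$$

We see from these two relations that

$$\widetilde x = K\,\frac{\gamma_1}{\gamma_1 - 1}, \qquad c_1 = \gamma_1^{-\gamma_1} \Bigl(\frac{\gamma_1 - 1}{K}\Bigr)^{\gamma_1 - 1}. \tag{36}$$

Thus, the solution $\widetilde V(x)$ to (25)-(27) can be represented as follows:

$$\widetilde V(x) = \begin{cases} c_1 x^{\gamma_1}, & x < \widetilde x, \\ x - K, & x \ge \widetilde x, \end{cases} \tag{37}$$

where $\widetilde x$ and $c_1$ are defined by (36).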
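
As a sanity check, conditions (25)-(27) can be verified numerically for the pair (36). The sketch below uses central finite differences for the derivatives in (28) and keeps a general $\sigma$ (the root is taken in the form (13)); the function name `stefan_check`, the step size, and the parameter values are assumptions made for illustration.

```python
import math


def stefan_check(K, r, sigma, lam, h=1e-4):
    """Return (ODE residual of (25), value gap in (26), left slope in (27))."""
    a = 0.5 - r / sigma**2
    g1 = a + math.sqrt(a * a + 2.0 * (lam + r) / sigma**2)
    x_t = K * g1 / (g1 - 1.0)                           # free boundary, (36)
    c1 = g1 ** (-g1) * ((g1 - 1.0) / K) ** (g1 - 1.0)   # constant, (36)

    def V(x):
        return c1 * x**g1 if x < x_t else x - K          # function (37)

    # Finite-difference derivatives at an interior point x0 < x_t.
    x0 = 0.5 * x_t
    dV = (V(x0 + h) - V(x0 - h)) / (2.0 * h)
    d2V = (V(x0 + h) - 2.0 * V(x0) + V(x0 - h)) / h**2
    ode_residual = r * x0 * dV + 0.5 * sigma**2 * x0**2 * d2V - (lam + r) * V(x0)

    value_gap = c1 * x_t**g1 - (x_t - K)        # should vanish by (34)
    slope_left = c1 * g1 * x_t ** (g1 - 1.0)    # should equal 1 by (35)
    return ode_residual, value_gap, slope_left


print(stefan_check(1.0, 0.05, 0.3, 0.05))
```

Up to discretization and rounding error the residual and the value gap vanish and the left slope equals 1, the slope of $x - K$.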


Remark. If K = 1, then V(x) is just the function V(x) defined by (39) in Chap- 
ter VI, § 5b, which is not that surprising if we take into account (22) and our method 
of finding V(x) in Chapter VI. 

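The theorem will be proved once we show that the function
$\widetilde V(x)$ so obtained is the price $V^*(x)$ (see (7)) and the time

$$\widetilde\tau = \inf\{t \ge 0\colon S_t \ge \widetilde x\}$$

is optimal in the class $\overline{\mathfrak{M}}_0^\infty$ (and in the class $\mathfrak{M}_0^\infty$ if $\mathsf P_x(\widetilde\tau < \infty) = 1$).
To this end it obviously suffices to use the following test: for $x \in E = (0, \infty)$
we must have

(A) $\widetilde V(x) = \mathsf E_x e^{-(\lambda + r)\widetilde\tau}(S_{\widetilde\tau} - K)^+ I(\widetilde\tau < \infty)$

and

(B) $\widetilde V(x) \ge \mathsf E_x e^{-(\lambda + r)\tau}(S_\tau - K)^+ I(\tau < \infty)$ for each $\tau \in \overline{\mathfrak{M}}_0^\infty$.

Further, since $(S_{\widetilde\tau} - K)^+ I(\widetilde\tau < \infty) = \widetilde V(S_{\widetilde\tau}) I(\widetilde\tau < \infty)$ and $\widetilde V(x) \ge (x - K)^+$,
for (A) and (B) we must verify the following conditions:

(A') $\widetilde V(x) = \mathsf E_x e^{-(\lambda + r)\widetilde\tau}\,\widetilde V(S_{\widetilde\tau}) I(\widetilde\tau < \infty)$

and

(B') $\widetilde V(x) \ge \mathsf E_x e^{-(\lambda + r)\tau}\,\widetilde V(S_\tau) I(\tau < \infty)$ for each $\tau \in \overline{\mathfrak{M}}_0^\infty$ and $x \in E$.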

Usually, the verification of (A') and (B') is based on Itô's formula for $\widetilde V(x)$ (more
precisely, we use its generalization, the Itô-Meyer formula). It proceeds as follows.

Let V = V(x) be a function in the class $C^2$, i.e., a function with continuous
second derivative. Then the 'classical' Itô formula (Chapter III, § 5c) for the
function $F(t, x) = e^{-(\lambda + r)t} V(x)$ and the process $S = (S_t)_{t \ge 0}$ brings one to the
following representation:

$$e^{-(\lambda + r)t} V(S_t) = V(S_0) + \int_0^t e^{-(\lambda + r)u}\bigl[LV(S_u) - (\lambda + r)V(S_u)\bigr]\,du + \int_0^t e^{-(\lambda + r)u}\,\sigma S_u V'(S_u)\,dW_u. \tag{38}$$
0 


Considering now the function $\widetilde V(x)$ defined in (37) we can observe that it is
in the class $C^2$ for $x \in E = (0, \infty)$ outside one point $x = \widetilde x$, so that one can
anticipate (38) also for $V(x) = \widetilde V(x)$ if one chooses a suitable interpretation of the
derivative at $x = \widetilde x$.

In our case $\widetilde V(x)$ is (downward) convex and its first derivative $\widetilde V'(x)$ is well
defined and continuous for all $x \in E = (0, \infty)$; its second derivative $\widetilde V''(x)$ is
defined for $x \in E = (0, \infty)$ distinct from $\widetilde x$, where we have the well-defined limits

$$\widetilde V''_-(\widetilde x) = \lim_{x \uparrow \widetilde x} \widetilde V''(x) \quad\text{and}\quad \widetilde V''_+(\widetilde x) = \lim_{x \downarrow \widetilde x} \widetilde V''(x).$$

There exists a generalization of Itô's formula in stochastic calculus obtained by
P.-A. Meyer for a function V(x) that is a difference of two convex functions. (See,
e.g., [248; (5.52)] or the Itô-Meyer formula in [395; IV].)

Our function $\widetilde V(x)$ is (downward) convex and for $F(t, x) = e^{-(\lambda + r)t}\,\widetilde V(x)$ and
$S = (S_t)_{t \ge 0}$ we have the Itô-Meyer formula, which looks the same as (38), but
where $\widetilde V''(\widetilde x)$ is replaced by, say, $\widetilde V''_-(\widetilde x)$.

Having agreed about this we obtain

$$e^{-(\lambda + r)t}\,\widetilde V(S_t) - \widetilde V(S_0) = \int_0^t e^{-(\lambda + r)u}\bigl[L\widetilde V(S_u) - (\lambda + r)\widetilde V(S_u)\bigr]\,du + M_t, \tag{39}$$

where

$$M_t = \int_0^t e^{-(\lambda + r)u}\,\sigma S_u \widetilde V'(S_u)\,dW_u. \tag{40}$$
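It is worth noting that

$$L\widetilde V(x) - (\lambda + r)\widetilde V(x) = 0 \tag{41}$$

for $x < \widetilde x$ (by (25)), and it is a straightforward calculation that this equality holds
also for $x = \widetilde x$, while for $x > \widetilde x$ we have

$$L\widetilde V(x) - (\lambda + r)\widetilde V(x) < 0. \tag{42}$$

By (39), (41), and (42) we obtain (for $S_0 = x$)

$$\widetilde V(x) \ge e^{-(\lambda + r)t}\,\widetilde V(S_t) - M_t. \tag{43}$$

As is clear from (40), the process $M = (M_t)_{t \ge 0}$ is a local martingale. Let $(\tau_n)$
be some localizing sequence and let $\tau \in \overline{\mathfrak{M}}_0^\infty$. Then by (43) we obtain

$$\widetilde V(x) \ge \mathsf E_x e^{-(\lambda + r)(\tau \wedge \tau_n)}\,\widetilde V(S_{\tau \wedge \tau_n}) - \mathsf E_x M_{\tau \wedge \tau_n}
= \mathsf E_x e^{-(\lambda + r)(\tau \wedge \tau_n)}\,\widetilde V(S_{\tau \wedge \tau_n})
\ge \mathsf E_x e^{-(\lambda + r)(\tau \wedge \tau_n)}\,\widetilde V(S_{\tau \wedge \tau_n}) I(\tau < \infty)$$

and, by Fatou's lemma,

$$\widetilde V(x) \ge \liminf_{n} \mathsf E_x e^{-(\lambda + r)(\tau \wedge \tau_n)}\,\widetilde V(S_{\tau \wedge \tau_n}) I(\tau < \infty)
\ge \mathsf E_x e^{-(\lambda + r)\tau}\,\widetilde V(S_\tau) I(\tau < \infty),$$

which proves (B').
We now verify (A').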


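If $x \in \widetilde D = \{x\colon x \ge \widetilde x\}$, then $\mathsf P_x(\widetilde\tau = 0) = 1$, and property (A') is obvious.

Now let $x \in \widetilde C = \{x\colon x < \widetilde x\}$. Then by (41) and (39) we obtain

$$\widetilde V(x) = e^{-(\lambda + r)(\widetilde\tau \wedge \tau_n)}\,\widetilde V(S_{\widetilde\tau \wedge \tau_n}) - M_{\widetilde\tau \wedge \tau_n},$$

so that

$$\widetilde V(x) = \mathsf E_x e^{-(\lambda + r)(\widetilde\tau \wedge \tau_n)}\,\widetilde V(S_{\widetilde\tau \wedge \tau_n})
= \mathsf E_x e^{-(\lambda + r)(\widetilde\tau \wedge \tau_n)}\,\widetilde V(S_{\widetilde\tau \wedge \tau_n}) I(\widetilde\tau < \infty)
+ \mathsf E_x e^{-(\lambda + r)\tau_n}\,\widetilde V(S_{\tau_n}) I(\widetilde\tau = \infty). \tag{44}$$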
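Since

$$e^{-(\lambda + r)(\widetilde\tau \wedge \tau_n)}\,\widetilde V(S_{\widetilde\tau \wedge \tau_n}) I(\widetilde\tau < \infty)
\le \sup_{t \le \widetilde\tau}\bigl[e^{-(\lambda + r)t}\,\widetilde V(S_t)\bigr] I(\widetilde\tau < \infty)
\le \sup_{t \le \widetilde\tau}\bigl[e^{-(\lambda + r)t} S_t\bigr] I(\widetilde\tau < \infty)
\le x \sup_{t \ge 0}\bigl[e^{-\lambda t} e^{\sigma W_t - \frac{\sigma^2}{2}t}\bigr]$$

and

$$\mathsf E \sup_{t \ge 0}\bigl[e^{-\lambda t} e^{\sigma W_t - \frac{\sigma^2}{2}t}\bigr] < \infty \tag{45}$$

(see Corollary 2 to Lemma 1 below), it follows from Lebesgue's dominated conver-
gence theorem that

$$\lim_{n} \mathsf E_x e^{-(\lambda + r)(\widetilde\tau \wedge \tau_n)}\,\widetilde V(S_{\widetilde\tau \wedge \tau_n}) I(\widetilde\tau < \infty) = \mathsf E_x e^{-(\lambda + r)\widetilde\tau}\,\widetilde V(S_{\widetilde\tau}) I(\widetilde\tau < \infty). \tag{46}$$

Further, $\widetilde V(S_{\tau_n}) \le \widetilde V(\widetilde x) < \infty$ on the set $\{\omega\colon \widetilde\tau = \infty\}$, therefore

$$\lim_{n} \mathsf E_x e^{-(\lambda + r)\tau_n}\,\widetilde V(S_{\tau_n}) I(\widetilde\tau = \infty) = 0. \tag{47}$$

Required property (A') is a consequence of (44), (46), and (47).
To complete the proof of the theorem we must verify property (45) and prove (17)
(with $\tau^* = \widetilde\tau$). We shall establish the following result to this end.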


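LEMMA 1. For $x \ge 0$, $\mu \in \mathbb R$, and $\sigma > 0$ we have

$$\mathsf P\Bigl(\max_{s \le t}(\mu s + \sigma W_s) \le x\Bigr) = \Phi\Bigl(\frac{x - \mu t}{\sigma\sqrt t}\Bigr) - e^{\frac{2\mu x}{\sigma^2}}\,\Phi\Bigl(\frac{-x - \mu t}{\sigma\sqrt t}\Bigr), \tag{48}$$

where $\Phi(x) = \dfrac{1}{\sqrt{2\pi}} \displaystyle\int_{-\infty}^{x} e^{-y^2/2}\,dy$.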

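Proof. For simplicity, let $\sigma^2 = 1$. By Girsanov's theorem (Chapter III, § 3e or
Chapter VII, § 3b)

$$\mathsf P\Bigl(\max_{s \le t}(\mu s + W_s) > x,\ \mu t + W_t \le x\Bigr)
= \mathsf E\, I\Bigl(\max_{s \le t}(\mu s + W_s) > x,\ \mu t + W_t \le x\Bigr)
= \mathsf E \exp\Bigl(\mu W_t - \frac{\mu^2}{2}t\Bigr) I\Bigl(\max_{s \le t} W_s > x,\ W_t \le x\Bigr). \tag{49}$$

We set $\tau_x = \inf\{t \ge 0\colon W_t = x\}$. Then, by D. André's reflection principle,

$$\widetilde W_t = W_t I(t \le \tau_x) + (2x - W_t) I(t > \tau_x) \tag{50}$$

is also a Wiener process (see Chapter III, § 3b, and also [124], [266], and [439]).
By (49) and (50) we obtain

$$\begin{aligned}
\mathsf P\Bigl(\max_{s \le t}(\mu s + W_s) \le x\Bigr)
&= \mathsf P(\mu t + W_t \le x) - \mathsf P\Bigl(\max_{s \le t}(\mu s + W_s) > x,\ \mu t + W_t \le x\Bigr) \\
&= \Phi\Bigl(\frac{x - \mu t}{\sqrt t}\Bigr) - \mathsf E \exp\Bigl(\mu W_t - \frac{\mu^2}{2}t\Bigr) I\Bigl(\max_{s \le t} W_s > x,\ W_t \le x\Bigr) \\
&= \Phi\Bigl(\frac{x - \mu t}{\sqrt t}\Bigr) - \mathsf E \exp\Bigl(\mu(2x - \widetilde W_t) - \frac{\mu^2}{2}t\Bigr) I(\widetilde W_t > x) \\
&= \Phi\Bigl(\frac{x - \mu t}{\sqrt t}\Bigr) - \mathsf E \exp\Bigl(\mu(2x - W_t) - \frac{\mu^2}{2}t\Bigr) I(W_t > x) \\
&= \Phi\Bigl(\frac{x - \mu t}{\sqrt t}\Bigr) - e^{2\mu x}\,\mathsf P(-\mu t + W_t > x) \\
&= \Phi\Bigl(\frac{x - \mu t}{\sqrt t}\Bigr) - e^{2\mu x}\,\Phi\Bigl(\frac{-x - \mu t}{\sqrt t}\Bigr).
\end{aligned}$$

The proof is complete.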


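COROLLARY 1. If $\mu < 0$, then

$$\mathsf P\Bigl(\sup_{t \ge 0}(\mu t + \sigma W_t) \le x\Bigr) = 1 - \exp\Bigl\{\frac{2\mu x}{\sigma^2}\Bigr\}. \tag{51}$$

If $\mu \ge 0$, then

$$\mathsf P\Bigl(\sup_{t \ge 0}(\mu t + \sigma W_t) \le x\Bigr) = 0. \tag{51'}$$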
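
Formula (48) and its limits (51), (51') as $t \to \infty$ are easy to examine numerically. In the sketch below the name `max_drifted_bm_cdf` and the chosen values of $\mu$, $\sigma$, x are illustrative assumptions.

```python
import math


def norm_cdf(z):
    """Standard normal distribution function Phi(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def max_drifted_bm_cdf(x, t, mu, sigma):
    """Right-hand side of (48): P(max_{s<=t}(mu*s + sigma*W_s) <= x), x >= 0."""
    s = sigma * math.sqrt(t)
    return norm_cdf((x - mu * t) / s) - math.exp(2.0 * mu * x / sigma**2) * norm_cdf((-x - mu * t) / s)


# Negative drift: the probability decreases in t towards 1 - exp(2*mu*x/sigma^2), cf. (51).
limit_neg = 1.0 - math.exp(2.0 * (-0.5) * 1.0 / 1.0**2)
print(max_drifted_bm_cdf(1.0, 1.0, -0.5, 1.0),
      max_drifted_bm_cdf(1.0, 1e4, -0.5, 1.0),
      limit_neg)

# Positive drift: the probability tends to zero, cf. (51').
print(max_drifted_bm_cdf(1.0, 1e4, 0.5, 1.0))
```

For x = 0 the two terms in (48) cancel, which matches the fact that the running maximum of the process started at zero exceeds zero immediately.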


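COROLLARY 2 (to the proof of (45)). Setting $\mu = -\bigl(\lambda + \frac{\sigma^2}{2}\bigr)$ in (51) we obtain

$$\mathsf P\Bigl(\sup_{t \ge 0}\Bigl[\sigma W_t - \Bigl(\lambda + \frac{\sigma^2}{2}\Bigr)t\Bigr] \le x\Bigr) = 1 - \exp\Bigl\{-\Bigl(1 + \frac{2\lambda}{\sigma^2}\Bigr)x\Bigr\}. \tag{52}$$

Hence if $\lambda > 0$, then (45) holds.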
COROLLARY 3 (to the proof of (17)). Let $S_t = x e^{\sigma W_t + (r - \frac{\sigma^2}{2})t}$ and assume that
$x^* > x$. Then one sees from (51) with $\mu = r - \frac{\sigma^2}{2} < 0$ that

$$\mathsf P_x(\tau^* = \infty) = \mathsf P\Bigl(\sup_{t \ge 0}\Bigl[\sigma W_t + \Bigl(r - \frac{\sigma^2}{2}\Bigr)t\Bigr] < \ln\frac{x^*}{x}\Bigr) = 1 - \Bigl(\frac{x}{x^*}\Bigr)^{1 - \frac{2r}{\sigma^2}}, \tag{53}$$

which proves (17) for $r < \frac{\sigma^2}{2}$ and $x < x^*$. This formula is obvious for $x \ge x^*$, while
for $x < x^*$ and $\mu = r - \frac{\sigma^2}{2} \ge 0$ formula (17) is a consequence of (51').

All this completes the first proof of the theorem.
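
Formula (17) for the probability that the option is ever exercised is simple enough to code directly. The following sketch is illustrative; the function name `prob_exercise` is an assumption of this example.

```python
def prob_exercise(x, x_star, r, sigma):
    """P_x(tau* < infinity) from (17) for tau* = inf{t: S_t >= x_star}."""
    if x >= x_star or r >= 0.5 * sigma**2:
        return 1.0
    # Case r < sigma^2/2 and x < x_star: the drift of ln S_t is negative,
    # so the threshold is reached only with probability (x/x_star)^(1 - 2r/sigma^2).
    return (x / x_star) ** (1.0 - 2.0 * r / sigma**2)


print(prob_exercise(1.0, 2.0, 0.0, 1.0))   # r = 0 < sigma^2/2: equals x / x_star
print(prob_exercise(1.0, 2.0, 0.5, 1.0))   # r = sigma^2/2: exercise is certain
```

The first call illustrates that for r = 0 the probability reduces to the ratio $x / x^*$.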


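6. The second proof. Let $\beta = \lambda + r$, assume that $\lambda > 0$, let $\gamma_1$ be as in (31),
and let $S_0 = 1$.
Setting

$$Z_t = e^{-\beta t} S_t^{\gamma_1}, \tag{54}$$

we obtain

$$Z_t = \exp\Bigl\{\gamma_1 \sigma W_t - \frac{(\gamma_1 \sigma)^2}{2}\,t\Bigr\}. \tag{55}$$

Hence $Z = (Z_t)$ is a P-martingale and

$$e^{-\beta t}(S_t - K)^+ = S_t^{-\gamma_1}(S_t - K)^+ Z_t.$$

Setting, in addition,

$$G(x) = x^{-\gamma_1}(x - K)^+,$$

we see that

$$\overline V^*(1) = \sup_{\tau \in \overline{\mathfrak{M}}_0^\infty} \mathsf E e^{-(\lambda + r)\tau}(S_\tau - K)^+ I(\tau < \infty)
= \sup_{\tau \in \overline{\mathfrak{M}}_0^\infty} \mathsf E\, G(S_\tau) Z_\tau I(\tau < \infty). \tag{56}$$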


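The process $S = (S_t)_{t \ge 0}$ under consideration is generated by a Wiener process
$W = (W_t)_{t \ge 0}$; without loss of generality we can assume that $(\Omega, \mathscr F, (\mathscr F_t)_{t \ge 0}, \mathsf P)$ is
a coordinate Wiener filtered space, i.e., $\Omega = C[0, \infty)$ is the space of continuous
functions $\omega = (\omega(t))_{t \ge 0}$, $\mathscr F_t = \sigma(\omega\colon \omega(s), s \le t)$, $\mathscr F = \bigvee \mathscr F_t$, and P is the Wiener
measure.

Let $\widetilde{\mathsf P}$ be a measure in $(\Omega, \mathscr F)$ such that the process $\widetilde W = (\widetilde W_t)_{t \ge 0}$ with
$\widetilde W_t = W_t - (\gamma_1 \sigma)t$ is a Wiener process with respect to $\widetilde{\mathsf P}$.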
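If $\widetilde{\mathsf P}_t = \widetilde{\mathsf P}\,|\,\mathscr F_t$ and $\mathsf P_t = \mathsf P\,|\,\mathscr F_t$ are the restrictions of $\widetilde{\mathsf P}$ and P to $\mathscr F_t$, then $\widetilde{\mathsf P}_t \sim \mathsf P_t$,
and the Radon-Nikodym derivative is

$$\frac{d\widetilde{\mathsf P}_t}{d\mathsf P_t} = Z_t, \tag{57}$$

where $Z_t$ is defined by (55). (See, for instance, Theorem 2 in Chapter III, § 3e.)
Hence, if $A \in \mathscr F_t$, then

$$\widetilde{\mathsf E} I_A = \mathsf E I_A Z_t,$$

where $\widetilde{\mathsf E}$ is averaging with respect to $\widetilde{\mathsf P}$, and if $A \in \mathscr F_\tau$, then

$$\widetilde{\mathsf E} I_A I(\tau < \infty) = \mathsf E Z_\tau I_A I(\tau < \infty) \tag{58}$$

(cf. formula (2) in Chapter V, § 3a).
Hence we obtain that if $f = f(\omega)$ is a nonnegative $\mathscr F_\tau$-measurable function, then

$$\widetilde{\mathsf E} f I(\tau < \infty) = \mathsf E Z_\tau f I(\tau < \infty). \tag{59}$$

Taken together with (56), this shows that

$$\overline V^*(1) = \sup_{\tau \in \overline{\mathfrak{M}}_0^\infty} \widetilde{\mathsf E}\, G(S_\tau) I(\tau < \infty). \tag{60}$$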
In other words, the optimal stopping problem (8) is equivalent (for x = 1) to
another problem, (60), which can be easily solved by the following arguments. We
consider the function $G(x) = x^{-\gamma_1}(x - K)^+$. This function attains its maximum
on $E = (0, \infty)$ at the point $x^* = K\,\dfrac{\gamma_1}{\gamma_1 - 1}$ (cf. (15)), and

$$\max_{x} G(x) = c^* \quad (= G(x^*)), \tag{61}$$

where $c^*$ is defined by (14). Hence by (60) we obtain

$$\overline V^*(1) \le c^* \sup_{\tau \in \overline{\mathfrak{M}}_0^\infty} \widetilde{\mathsf E} I(\tau < \infty) \le c^*. \tag{62}$$

Let $\tau^* = \inf\{t \ge 0\colon S_t \ge x^*\}$ and let the initial value $S_0$ be equal to $1 < x^*$.
Since $\lambda > 0$ by assumption, it follows that $x^* < \infty$.


LEMMA 2. 1) For $\lambda > 0$ we have

$$\widetilde{\mathsf P}(\tau^* < \infty) = 1. \tag{63}$$

2) If $r \ge \frac{\sigma^2}{2}$, then

$$\mathsf P(\tau^* < \infty) = 1. \tag{64}$$
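Proof. With respect to $\widetilde{\mathsf P}$, the process $\widetilde W_t = W_t - (\gamma_1 \sigma)t$ ($t \ge 0$) is Wiener, and by
Girsanov's theorem

$$\begin{aligned}
\widetilde{\mathsf P}(\tau^* < \infty) &= \widetilde{\mathsf P}\Bigl(\max_{t \ge 0} S_t \ge x^*\Bigr)
= \widetilde{\mathsf P}\Bigl(\max_{t \ge 0}\Bigl[\sigma W_t + \Bigl(r - \frac{\sigma^2}{2}\Bigr)t\Bigr] \ge \ln x^*\Bigr) \\
&= \widetilde{\mathsf P}\Bigl(\max_{t \ge 0}\Bigl[\sigma \widetilde W_t + \Bigl(\gamma_1 \sigma^2 + r - \frac{\sigma^2}{2}\Bigr)t\Bigr] \ge \ln x^*\Bigr) \\
&= \mathsf P\Bigl(\max_{t \ge 0}\Bigl[\sigma W_t + \Bigl(\gamma_1 \sigma^2 + r - \frac{\sigma^2}{2}\Bigr)t\Bigr] \ge \ln x^*\Bigr) = 1,
\end{aligned}$$

where the last equality follows from (51') and the relation

$$\gamma_1 \sigma^2 + r - \frac{\sigma^2}{2} = \sigma^2 \sqrt{\Bigl(\frac12 - \frac{r}{\sigma^2}\Bigr)^2 + \frac{2(\lambda + r)}{\sigma^2}} > 0.$$

Thus, we have proved (63). Property (64) was established by Corollary 3. This
completes the proof of Lemma 2.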

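We return to (62). Since $\widetilde{\mathsf P}(\tau^* < \infty) = 1$ and $G(S_{\tau^*}) = G(x^*) = c^*$, it follows
that

$$\widetilde{\mathsf E}\, G(S_{\tau^*}) = c^*$$

and by (62) we obtain

$$\overline V^*(1) = \widetilde{\mathsf E}\, G(S_{\tau^*}) = \widetilde{\mathsf E}\, G(S_{\tau^*}) I(\tau^* < \infty)
= \mathsf E e^{-(\lambda + r)\tau^*}(S_{\tau^*} - K)^+ I(\tau^* < \infty) = c^*.$$

This gives us the second proof of formula (12) (for x = 1) and shows that $\tau^*$ is
an optimal stopping time. If $r \ge \frac{\sigma^2}{2}$, then $\mathsf P(\tau^* < \infty) = 1$, and therefore $\tau^*$ is in
this case an optimal stopping time in the class $\mathfrak{M}_0^\infty$.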
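
The key step (61) of the second proof, namely that $G(x) = x^{-\gamma_1}(x - K)^+$ is maximized at $x^*$ with maximum $c^*$, can be checked by a brute-force grid search. The sketch below is illustrative; the grid, the parameter values, and the variable names are assumptions of this example.

```python
import math

# Illustrative parameters (assumptions for this sketch only).
K, r, sigma, lam = 1.0, 0.05, 0.3, 0.05

a = 0.5 - r / sigma**2
g1 = a + math.sqrt(a * a + 2.0 * (lam + r) / sigma**2)   # root (13)
x_star = K * g1 / (g1 - 1.0)                             # point (15)
c_star = g1 ** (-g1) * ((g1 - 1.0) / K) ** (g1 - 1.0)    # constant (14)


def G(x):
    """The function G(x) = x^{-gamma_1} (x - K)^+ of the second proof."""
    return x ** (-g1) * max(x - K, 0.0)


# Brute-force maximization over a uniform grid on (K, K + 20].
grid = [K + 1e-3 * i for i in range(1, 20001)]
x_best = max(grid, key=G)
print(x_best, x_star, G(x_best), c_star)
```

The grid maximizer agrees with $x^*$ up to the grid step, and the maximal value agrees with $c^*$.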


§ 2b. Standard Put Option 


1. We can consider put options with pay-off functions $f_t = e^{-\lambda t} g(S_t)$, where
$g(x) = (K - x)^+$, $x \in E = (0, \infty)$, in the same way as we have considered call
options. For this reason we restrict ourselves to the statements of results and the
main points of their proofs.

We shall consider a diffusion (B, S)-market described by the representations (1)
and (2) in § 2a. Let

$$U_*(x) = \sup_{\tau \in \mathfrak{M}_0^\infty} \mathsf E_x e^{-(\lambda + r)\tau}(K - S_\tau)^+, \tag{1}$$

$$\overline U_*(x) = \sup_{\tau \in \overline{\mathfrak{M}}_0^\infty} \mathsf E_x e^{-(\lambda + r)\tau}(K - S_\tau)^+ I(\tau < \infty). \tag{2}$$
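THEOREM. Assume that $\lambda > 0$. Then

$$U_*(x) = \overline U_*(x) = \begin{cases} c_* x^{\gamma_2}, & x > x_*, \\ K - x, & x \le x_*, \end{cases} \tag{3}$$

where

$$\gamma_2 = \Bigl(\frac12 - \frac{r}{\sigma^2}\Bigr) - \sqrt{\Bigl(\frac12 - \frac{r}{\sigma^2}\Bigr)^2 + \frac{2(\lambda + r)}{\sigma^2}}, \tag{4}$$

$$c_* = |\gamma_2|^{|\gamma_2|} \Bigl(\frac{K}{1 + |\gamma_2|}\Bigr)^{1 + |\gamma_2|}, \tag{5}$$

$$x_* = K\,\frac{|\gamma_2|}{1 + |\gamma_2|}. \tag{6}$$

There exists an optimal stopping time in the class $\overline{\mathfrak{M}}_0^\infty$, namely,

$$\tau_* = \inf\{t \ge 0\colon S_t \le x_*\}. \tag{7}$$

Moreover,

$$\mathsf P_x(\tau_* < \infty) = \begin{cases} 1 & \text{if } r \le \dfrac{\sigma^2}{2} \text{ or } x \le x_*, \\[2mm] \Bigl(\dfrac{x_*}{x}\Bigr)^{\frac{2r}{\sigma^2} - 1} & \text{if } r > \dfrac{\sigma^2}{2} \text{ and } x > x_*. \end{cases} \tag{8}$$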

This result is even slightly simpler than the theorem in § 2a: the function
$g(x) = (K - x)^+$ is now bounded.

By analogy with the corresponding discrete-time problem (Chapter VI, § 5c) it
would be natural to assume that the domain of continued observations $C_*$ and the
stopping domain $D_*$ have the following form:

$$C_* = \{x \in E\colon x > x_*\} = \{x \in E\colon U_*(x) > g(x)\}$$

and

$$D_* = \{x \in E\colon x \le x_*\} = \{x \in E\colon U_*(x) = g(x)\},$$

where $x_*$ and $U_*(x)$ are equal to the solutions ($\widetilde x$ and $\widetilde U(x)$) of the Stefan problem

$$L\widetilde U(x) = (\lambda + r)\widetilde U(x), \qquad x > \widetilde x, \tag{9}$$

$$\widetilde U(x) = g(x), \qquad x \le \widetilde x, \tag{10}$$

$$\frac{d\widetilde U(x)}{dx}\bigg|_{x \downarrow \widetilde x} = \frac{dg(x)}{dx}\bigg|_{x \uparrow \widetilde x}. \tag{11}$$
In our case the bounded solutions of (9) have the form $\widetilde U(x) = \widetilde c\,x^{\gamma_2}$ for $x > \widetilde x$,
where $\gamma_2$ is the negative root of the quadratic equation (30) in § 2a given by (4).
Using (10) and (11), we can find the values of $\widetilde c$ and $\widetilde x$, which are expressed by the
right-hand sides of (5) and (6).

We can prove the equality $U_*(x) = \widetilde U(x)$ and the optimality of $\widetilde\tau$ by testing the
conditions (A) and (B), using arguments similar to the ones presented in § 2a. The
proof of (8) is based on Lemma 1 in § 2a and the corollaries to it.


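Remark. The 'martingale' (i.e., the second) proof of the above theorem is based on
the observation that in our case the process $Z = (Z_t)_{t \ge 0}$ with

$$Z_t = e^{-\beta t} S_t^{\gamma_2}, \qquad \beta = \lambda + r,$$

is a martingale and

$$Z_t = \exp\Bigl\{\gamma_2 \sigma W_t - \frac{(\gamma_2 \sigma)^2}{2}\,t\Bigr\}.$$

Hence

$$e^{-\beta t}(K - S_t)^+ = S_t^{-\gamma_2}(K - S_t)^+ Z_t \le c_* Z_t,$$

and we can complete the proof as in the case of call options (§ 2a).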


§2c. Combinations of Put and Call Options 


1. In practice, as mentioned in Chapter VI, § 4e, alongside various kinds of options, 
one often encounters their combinations. One example here can be a strangle option, 
a combination of call and put options with different strike prices. 

In this section we present a calculation for an American strangle option, making 
again the assumption that it can be exercised at arbitrary time on [0,00) and the 
structure of the (B,S)-market is as described by relations (1)—(2) in § 2a. 

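In other words, we assume that

$$B_t = B_0 e^{rt} \tag{1}$$

and

$$S_t = x \exp\Bigl\{\sigma W_t + \Bigl(\mu - \frac{\sigma^2}{2}\Bigr)t\Bigr\}, \tag{2}$$

where $W = (W_t)_{t \ge 0}$ is a standard Wiener process and $\mu = r$. The original mea-
sure P is a martingale measure in this case.
For a discounted strangle option the pay-off function is as follows (cf. (3) in § 2a):

$$f_t = e^{-\lambda t} g(S_t), \qquad t \ge 0,$$

where

$$g(s) = \begin{cases} K_1 - s, & s \le K_1, \\ 0, & K_1 < s < K_2, \\ s - K_2, & s \ge K_2. \end{cases} \tag{3}$$

In accordance with the general theory (Chapter VII, § 4 and Chapter VI, § 2),
the price is

$$V^*(x) = \sup_{\tau \in \mathfrak{M}_0^\infty} B_0\,\mathsf E_x \frac{f_\tau}{B_\tau} = \sup_{\tau \in \mathfrak{M}_0^\infty} \mathsf E_x e^{-(\lambda + r)\tau} g(S_\tau), \tag{4}$$

where $\mathfrak{M}_0^\infty = \{\tau = \tau(\omega)\colon 0 \le \tau(\omega) < \infty,\ \omega \in \Omega\}$ is the class of finite stopping
times and $\mathsf E_x$ is averaging under the assumption that $S_0 = x \in E = (0, \infty)$.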
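2. To determine the price $V^*(x)$ and the corresponding optimal stopping time we
use the 'martingale' trick from [32], which we have already used in the preceding
sections (e.g., in the 'second proof' in § 2a.6). We shall assume here that $S_0 = 1$
and the strike prices $K_1$ and $K_2$ satisfy the inequality $K_1 < 1 < K_2$.

Let

$$\gamma_1 = \Bigl(\frac12 - \frac{r}{\sigma^2}\Bigr) + \sqrt{\Bigl(\frac12 - \frac{r}{\sigma^2}\Bigr)^2 + \frac{2(\lambda + r)}{\sigma^2}}, \tag{5}$$

$$\gamma_2 = \Bigl(\frac12 - \frac{r}{\sigma^2}\Bigr) - \sqrt{\Bigl(\frac12 - \frac{r}{\sigma^2}\Bigr)^2 + \frac{2(\lambda + r)}{\sigma^2}} \tag{6}$$

be the roots of the quadratic equation (30) in § 2a.
As shown in §§ 2a, b, the processes $M_t^{(1)} = e^{-\beta t} S_t^{\gamma_1}$ and $M_t^{(2)} = e^{-\beta t} S_t^{\gamma_2}$,
$t \ge 0$, with $\beta = \lambda + r$ are P-martingales. Hence the nonnegative process
$M_t(p) = p M_t^{(1)} + (1 - p) M_t^{(2)}$ is a P-martingale for each $0 \le p \le 1$, and

$$V^*(1) = \sup_{\tau \in \mathfrak{M}_0^\infty} \mathsf E_1 e^{-(\lambda + r)\tau} g(S_\tau)
= \sup_{\tau \in \mathfrak{M}_0^\infty} \mathsf E_1 M_\tau(p)\,\frac{g(S_\tau)}{p S_\tau^{\gamma_1} + (1 - p) S_\tau^{\gamma_2}}. \tag{7}$$

Proceeding as in § 2a.6 we consider measures $\widetilde{\mathsf P}(p)$ such that

$$\frac{d\widetilde{\mathsf P}_t(p)}{d\mathsf P_t} = M_t(p). \tag{8}$$

Then we conclude from (7) that

$$V^*(1) = \sup_{\tau \in \mathfrak{M}_0^\infty} \mathsf E_{\widetilde{\mathsf P}(p)}\,\frac{g(S_\tau)}{p S_\tau^{\gamma_1} + (1 - p) S_\tau^{\gamma_2}}, \tag{9}$$

where $\mathsf E_{\widetilde{\mathsf P}(p)}$ is averaging with respect to $\widetilde{\mathsf P}(p)$.

The next step is to choose a suitable value of p in [0, 1] (which we denote by $p^*$
in what follows) such that we can solve the corresponding optimal stopping prob-
lem (9).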
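As shown in [32], the following system of equations for $(p, s_1, s_2)$:

$$\frac{s_2 - K_2}{p s_2^{\gamma_1} + (1 - p) s_2^{\gamma_2}} = \frac{K_1 - s_1}{p s_1^{\gamma_1} + (1 - p) s_1^{\gamma_2}}, \tag{10}$$

$$\frac{s_2}{s_2 - K_2} = \frac{p \gamma_1 s_2^{\gamma_1} + (1 - p) \gamma_2 s_2^{\gamma_2}}{p s_2^{\gamma_1} + (1 - p) s_2^{\gamma_2}}, \tag{11}$$

$$\frac{s_1}{s_1 - K_1} = \frac{p \gamma_1 s_1^{\gamma_1} + (1 - p) \gamma_2 s_1^{\gamma_2}}{p s_1^{\gamma_1} + (1 - p) s_1^{\gamma_2}}, \tag{12}$$

where $p \in [0, 1]$, $s_2 > K_2$, and $s_1 < K_1$, has a unique solution $(p^*, s_1^*, s_2^*)$.
Let

$$\widetilde c = \frac{s_2^* - K_2}{p^* (s_2^*)^{\gamma_1} + (1 - p^*)(s_2^*)^{\gamma_2}} \quad \Bigl(= \frac{K_1 - s_1^*}{p^* (s_1^*)^{\gamma_1} + (1 - p^*)(s_1^*)^{\gamma_2}}\Bigr).$$

A simple analysis shows that

$$\sup_{s \le K_1} \frac{g(s)}{p^* s^{\gamma_1} + (1 - p^*) s^{\gamma_2}} = \sup_{s \ge K_2} \frac{g(s)}{p^* s^{\gamma_1} + (1 - p^*) s^{\gamma_2}} = \widetilde c.$$

In addition, the function $G(s) = \dfrac{g(s)}{p^* s^{\gamma_1} + (1 - p^*) s^{\gamma_2}}$ takes its maximum at the
points $s_1^* < K_1$ and $s_2^* > K_2$.
Hence, by (9) we obtain

$$V^*(1) \le \widetilde c. \tag{13}$$

We set

$$\tau^* = \inf\{t\colon S_t = s_1^* \text{ or } S_t = s_2^*\}.$$

By the properties of a Brownian motion with drift, $\mathsf P(\tau^* < \infty) = 1$. Hence
$\mathsf E_{\widetilde{\mathsf P}(p^*)}\, G(S_{\tau^*}) = \widetilde c$, and therefore

$$V^*(1) = \widetilde c$$

and $\tau^*$ is the optimal stopping time.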

Remark. If $K_1 = K_2$, then the strangle option becomes a straddle option (see
Chapter VI, § 4e.2).


§ 2d. Russian Option 
1. We consider a diffusion (B, S)-market with

$$dB_t = r B_t\,dt, \qquad B_0 > 0, \tag{1}$$

and

$$dS_t = S_t(r\,dt + \sigma\,dW_t), \qquad S_0 > 0, \tag{2}$$

or, equivalently,

$$B_t = B_0 e^{rt}, \qquad S_t = S_0 e^{\sigma W_t + (r - \frac{\sigma^2}{2})t}.$$

Since

$$\frac{S_t}{B_t} = \frac{S_0}{B_0}\,e^{\sigma W_t - \frac{\sigma^2}{2}t}, \tag{3}$$

the process $\dfrac{S}{B} = \Bigl(\dfrac{S_t}{B_t}\Bigr)_{t \ge 0}$ is a martingale with respect to the original measure P.
Let

$$f_t = e^{-\lambda t} g_t(S), \qquad t \ge 0, \tag{4}$$

where

$$g_t(S) = \Bigl(\max_{u \le t} S_u - a S_t\Bigr)^+, \qquad a \ge 0. \tag{5}$$

American options with pay-off functions (4) (which are called Russian options
[434], [435]) belong to the class of put options with aftereffect and discounting.
(Cf. Chapter VI, § 5d.)

Using the same notation ($\mathsf E_x$, $\mathfrak{M}_0^\infty$, $\overline{\mathfrak{M}}_0^\infty$, ...) as in §§ 2a, b we set

$$U_*(x) = \sup_{\tau \in \mathfrak{M}_0^\infty} \mathsf E_x e^{-r\tau} f_\tau(S) \tag{6}$$

and

$$\overline U_*(x) = \sup_{\tau \in \overline{\mathfrak{M}}_0^\infty} \mathsf E_x e^{-r\tau} f_\tau(S) I(\tau < \infty). \tag{7}$$

By contrast to 'one-dimensional' optimal stopping problems for a Markov pro-
cess $S = (S_t)_{t \ge 0}$ considered in §§ 2a, b, the problems (6) and (7) are 'two-dimen-
sional' in the following sense: the functionals $f_\tau(S)$ depend on a two-dimensional
Markov process $\bigl(S_t, \max_{u \le t} S_u\bigr)_{t \ge 0}$.

It is remarkable, however, that using the methods of 'change of measure' one
can reduce these 'two-dimensional' problems to 'one-dimensional' ones, which makes
it possible to find explicit expressions for $U_*(x)$ ($= \overline U_*(x)$) and optimal stopping
times.


2. We have already explained rather explicitly the idea of this reduction of ‘two- 
dimensional’ problems to ‘one-dimensional’ in Chapter VI, § 5d, for the discrete-time 
case. 

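Here we proceed as in § 2a.5 and assume that $(\Omega, \mathscr F, (\mathscr F_t), \mathsf P)$ is the coordinate
Wiener space.
Let $\widehat{\mathsf P}$ be a measure in $(\Omega, \mathscr F)$ such that its restriction $\widehat{\mathsf P}_t$ is equivalent to $\mathsf P_t$ for
each t and

$$\frac{d\widehat{\mathsf P}_t}{d\mathsf P_t} = Z_t, \tag{8}$$

where

$$Z_t = \frac{S_t / B_t}{S_0 / B_0} \ \bigl(= e^{\sigma W_t - \frac{\sigma^2}{2}t}\bigr), \qquad t \ge 0. \tag{9}$$

The process $\widehat W = (\widehat W_t)_{t \ge 0}$ with

$$\widehat W_t = W_t - \sigma t \tag{10}$$

is Wiener with respect to $\widehat{\mathsf P}$ by Girsanov's theorem, and for $\tau \in \overline{\mathfrak{M}}_0^\infty$ we have

$$\begin{aligned}
\mathsf E_x e^{-(\lambda + r)\tau} g_\tau(S) I(\tau < \infty)
&= x\,\mathsf E_x e^{-\lambda\tau}\,\frac{S_\tau / S_0}{B_\tau / B_0}\,\frac{g_\tau(S)}{S_\tau}\,I(\tau < \infty) \\
&= x\,\mathsf E_x e^{-\lambda\tau} Z_\tau\,\frac{\bigl(\max_{u \le \tau} S_u - a S_\tau\bigr)^+}{S_\tau}\,I(\tau < \infty) \\
&= x\,\widehat{\mathsf E} e^{-\lambda\tau}\Bigl[\frac{\max_{u \le \tau} S_u}{S_\tau} - a\Bigr]^+ I(\tau < \infty).
\end{aligned} \tag{11}$$

We consider now the process $(\psi_t)_{t \ge 0}$ of the variables

$$\psi_t = \frac{\max\bigl\{\max_{u \le t} S_u,\ S_0 \psi_0\bigr\}}{S_t}, \tag{12}$$

where $\psi_0 \ge 1$.
Clearly, if $\psi_0 = 1$, then

$$\psi_t = \frac{\max_{u \le t} S_u}{S_t}, \tag{13}$$

and therefore

$$\widehat{\mathsf E} e^{-\lambda\tau}\Bigl[\frac{\max_{u \le \tau} S_u}{S_\tau} - a\Bigr]^+ I(\tau < \infty) = \widehat{\mathsf E} e^{-\lambda\tau}[\psi_\tau - a]^+ I(\tau < \infty). \tag{14}$$

Let $\widehat{\mathsf P}_\psi$ be the probability distribution of the process $(\psi_t)_{t \ge 0}$ under the assump-
tion $\psi_0 = \psi \ge 1$; we consider the following optimal stopping problems:

$$\widehat U(\psi) = \sup_{\tau \in \mathfrak{M}_0^\infty} \widehat{\mathsf E}_\psi e^{-\lambda\tau}[\psi_\tau - a]^+ \tag{15}$$

and

$$\overline{\widehat U}(\psi) = \sup_{\tau \in \overline{\mathfrak{M}}_0^\infty} \widehat{\mathsf E}_\psi e^{-\lambda\tau}[\psi_\tau - a]^+ I(\tau < \infty), \tag{16}$$

which can be regarded as pricing problems for a discounted American call option with
process of stock prices $(\psi_t)_{t \ge 0}$ and $B_t \equiv 1$. (We point out that our initial problem
relates to put options!)
By (6), (7), and (15), (16) we obtain (in view of (11) and (14)) the equalities

$$U_*(x) = x\,\widehat U(1), \qquad \overline U_*(x) = x\,\overline{\widehat U}(1). \tag{17}$$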


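3. Before we state the main results on the solution of the optimal stopping prob-
lems (15) and (16) we discuss some properties of the process $(\psi_t)_{t \ge 0}$.


LEMMA. 1) The process $(\psi_t)_{t \ge 0}$ is a diffusion Markov process on the phase space
$E = [1, \infty)$ with respect to the measure $\widehat{\mathsf P}$, with instantaneous reflection at the
point {1}.

2) The process $(\psi_t)_{t \ge 0}$ has the stochastic differential

$$d\psi_t = -\psi_t(r\,dt + \sigma\,d\widehat W_t) + d\varphi_t, \tag{18}$$

where $(\varphi_t)_{t \ge 0}$ is a nondecreasing process increasing on the set $\{(\omega, t)\colon \psi_t(\omega) = 1\}$,
and $\widehat W = (\widehat W_t)_{t \ge 0}$ is a Wiener process (with respect to $\widehat{\mathsf P}$).

3) If $q = q(\psi)$ is a function on $E = [1, \infty)$ such that $q \in C^2$ on $(1, \infty)$ and there
exists $q'(1+) = \lim_{\psi \downarrow 1} q'(\psi)$, then the infinitesimal operator L of the process $(\psi_t)_{t \ge 0}$ acts on q by

$$Lq(\psi) = -r\psi\,\frac{dq}{d\psi} + \frac{\sigma^2}{2}\,\psi^2\,\frac{d^2 q}{d\psi^2} \tag{19}$$

under the boundary condition

$$q'(1+) = 0. \tag{20}$$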
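Proof. By (12),

$$\begin{aligned}
\psi_{t + \Delta} &= \frac{\max\bigl\{\max_{u \le t + \Delta} S_u,\ S_0 \psi_0\bigr\}}{S_{t + \Delta}} \\
&= \max\Bigl\{\frac{\max_{u \le t} S_u}{S_t \cdot S_{t + \Delta}/S_t},\ \frac{S_0 \psi_0}{S_t \cdot S_{t + \Delta}/S_t},\ \frac{\max_{t < u \le t + \Delta} S_u / S_t}{S_{t + \Delta}/S_t}\Bigr\} \\
&= \max\Bigl\{\frac{\psi_t}{S_{t + \Delta}/S_t},\ \frac{\max_{t < u \le t + \Delta} S_u / S_t}{S_{t + \Delta}/S_t}\Bigr\}.
\end{aligned} \tag{21}$$

Note that for $t < u \le t + \Delta$ we have

$$\frac{S_u}{S_t} = \exp\Bigl\{\sigma(\widehat W_u - \widehat W_t) + \Bigl(r + \frac{\sigma^2}{2}\Bigr)(u - t)\Bigr\}.$$

Hence, taking into account that $\widehat W$ is a Wiener process with respect to $\widehat{\mathsf P}$, we obtain
the following 'Markovian' property:

$$\operatorname{Law}(\psi_{t + \Delta} \mid \mathscr F_t, \widehat{\mathsf P}) = \operatorname{Law}(\psi_{t + \Delta} \mid \psi_t, \widehat{\mathsf P}).$$

To deduce (18) we set

$$N_t = \max\Bigl\{\max_{u \le t} S_u,\ S_0 \psi_0\Bigr\}. \tag{22}$$

Clearly, $N = (N_t)_{t \ge 0}$ is a nondecreasing process of bounded variation.
By (2) and (10),

$$dS_t = S_t\bigl[(r + \sigma^2)\,dt + \sigma\,d\widehat W_t\bigr] \tag{23}$$

and

$$d\Bigl(\frac{1}{S_t}\Bigr) = -\frac{1}{S_t}\bigl[r\,dt + \sigma\,d\widehat W_t\bigr]. \tag{24}$$

Hence, by Itô's formula

$$d\psi_t = N_t\,d\Bigl(\frac{1}{S_t}\Bigr) + \frac{1}{S_t}\,dN_t = -\psi_t\bigl[r\,dt + \sigma\,d\widehat W_t\bigr] + \frac{dN_t}{S_t}, \tag{25}$$

or, in the integral form,

$$\psi_t = \psi_0 - r\int_0^t \psi_u\,du - \sigma\int_0^t \psi_u\,d\widehat W_u + \int_0^t \frac{dN_u}{S_u}. \tag{26}$$

We now set

$$\varphi_t = \int_0^t \frac{dN_u}{S_u} \tag{27}$$

and note that $dN_u(\omega) = 0$ on the set $\{(\omega, u)\colon \psi_u(\omega) > 1\}$ (in the sense of the
equality $\int_0^t I(\psi_u(\omega) > 1)\,dN_u(\omega) = 0$, $t \ge 0$). Hence

$$\varphi_t = \int_0^t I(\psi_u = 1)\,\frac{dN_u}{S_u}, \tag{28}$$

which shows even more clearly that the process $(\varphi_t)_{t \ge 0}$ changes the value only when
the process $(\psi_t)_{t \ge 0}$ arrives at the boundary point {1}.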
We claim that

$$\int_0^t I(\psi_u = 1)\,du = 0 \quad (\widehat{\mathsf P}\text{-a.s.}) \tag{29}$$

for each t > 0.
By Fubini's theorem

$$\widehat{\mathsf E}\int_0^t I(\psi_u = 1)\,du = \int_0^t \widehat{\mathsf E} I(\psi_u = 1)\,du = \int_0^t \widehat{\mathsf P}(\psi_u = 1)\,du = 0,$$

because $\widehat{\mathsf P}(\psi_u = 1) = 0$, which is a consequence of the fact that the distribution of
the pair $\bigl(\max_{s \le u} \widehat W_s, \widehat W_u\bigr)$ has a density.

Thus, the process $(\psi_t)_{t \ge 0}$ stays zero time ($\widehat{\mathsf P}$-a.s.) at {1}, so that this point is
an instantaneously reflecting boundary ([239; Chapter IV, § 7]).
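
The recursion (21) suggests a simple way to simulate $(\psi_t)$ on a time grid under the measure $\widehat{\mathsf P}$. The sketch below is a discretized illustration only: on the grid the maximum over $(t, t + \Delta]$ is replaced by the value at $t + \Delta$, so the second term in (21) becomes 1. The name `simulate_psi`, the seed, and the parameter values are assumptions of this example.

```python
import math
import random


def simulate_psi(psi0, r, sigma, T, n, seed=0):
    """Simulate psi on a grid of n steps over [0, T] under the measure P-hat."""
    rng = random.Random(seed)
    dt = T / n
    psi = psi0
    path = [psi]
    for _ in range(n):
        dW = rng.gauss(0.0, math.sqrt(dt))
        # S_{t+dt} / S_t under P-hat: drift r + sigma^2/2 in the exponent.
        ratio = math.exp(sigma * dW + (r + 0.5 * sigma**2) * dt)
        # Discrete version of (21): reflection at the level 1.
        psi = max(psi / ratio, 1.0)
        path.append(psi)
    return path


path = simulate_psi(1.0, 0.05, 0.3, 1.0, 1000)
print(min(path), max(path))
```

The simulated path never falls below 1, and it touches the level 1 exactly at the grid times when the stock price sets a new running maximum.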


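4. THEOREM. Assume that $\lambda > 0$, $a \ge 0$, and $\psi \ge 1$. Then

$$\widehat U(\psi) = \overline{\widehat U}(\psi) = \begin{cases} (\widetilde\psi - a)\,\dfrac{\gamma_2 \psi^{\gamma_1} - \gamma_1 \psi^{\gamma_2}}{\gamma_2 \widetilde\psi^{\gamma_1} - \gamma_1 \widetilde\psi^{\gamma_2}}, & 1 \le \psi < \widetilde\psi, \\[2mm] \psi - a, & \psi \ge \widetilde\psi, \end{cases} \tag{30}$$

where

$$\gamma_k = \frac{A}{2} + (-1)^k \sqrt{\Bigl(\frac{A}{2}\Bigr)^2 + B}, \qquad k = 1, 2, \tag{31}$$

are the roots of the quadratic equation

$$\gamma^2 - A\gamma - B = 0 \tag{32}$$

with

$$A = 1 + \frac{2r}{\sigma^2}, \qquad B = \frac{2\lambda}{\sigma^2};$$

the 'threshold' $\widetilde\psi$ is the solution of the transcendental equation

$$\psi^{\gamma_1}\Bigl(1 - \frac{1}{\gamma_1} - \frac{a}{\psi}\Bigr) = \psi^{\gamma_2}\Bigl(1 - \frac{1}{\gamma_2} - \frac{a}{\psi}\Bigr) \tag{33}$$

in the domain $\psi > a$. If a = 0, then

$$\widetilde\psi = \Bigl|\frac{\gamma_2}{\gamma_1} \cdot \frac{\gamma_1 - 1}{\gamma_2 - 1}\Bigr|^{\frac{1}{\gamma_2 - \gamma_1}}.$$

The stopping time

$$\widetilde\tau = \inf\{t \ge 0\colon \psi_t \ge \widetilde\psi\}$$

has the property $\widehat{\mathsf P}_\psi(\widetilde\tau < \infty) = 1$, $\psi \ge 1$, and it is optimal both in the class $\mathfrak{M}_0^\infty$
and in $\overline{\mathfrak{M}}_0^\infty$.

As in §§ 2a, b we present two proofs, one ('Markovian') based on the solution of
a Stefan problem and another based on 'martingale' considerations.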


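The first proof. The same arguments as in §§ 2a, b, based on the represen-
tation (22) in § 2a, show that the domain of continued observations $\widetilde C$ and the
stopping domain $\widetilde D$ in (15) must have the following form:

$$\widetilde C = \{\psi \ge 1\colon \psi < \widetilde\psi\} = \{\psi \ge 1\colon \widehat U(\psi) > g(\psi)\}$$

and

$$\widetilde D = \{\psi \ge 1\colon \psi \ge \widetilde\psi\} = \{\psi \ge 1\colon \widehat U(\psi) = g(\psi)\},$$

where $g(\psi) = (\psi - a)^+$.
As in §§ 2a, b, $\widehat U(\psi)$ and the unknown threshold $\widetilde\psi$ make up a solution of the
Stefan problem

$$L\widehat U(\psi) = \lambda \widehat U(\psi), \qquad 1 < \psi < \widetilde\psi, \tag{34}$$

$$\widehat U'(1+) = 0, \tag{35}$$

$$\widehat U(\psi) = g(\psi), \qquad \psi \ge \widetilde\psi, \tag{36}$$

$$\frac{d\widehat U(\psi)}{d\psi}\bigg|_{\psi \uparrow \widetilde\psi} = \frac{dg(\psi)}{d\psi}\bigg|_{\psi \downarrow \widetilde\psi}, \tag{37}$$

where the operator L is defined by (19).
We seek a solution to (34) in the form $\widehat U(\psi) = \psi^{\gamma}$. Then we obtain the quadratic
equation (32) for $\gamma$, which has the roots $\gamma_1 < 0$ and $\gamma_2 > 1$ described by (31).
Thus, equation (34) in the domain $\{\psi\colon 1 < \psi < \widetilde\psi\}$ has the following general
solution:

$$\widehat U(\psi) = c_1 \psi^{\gamma_1} + c_2 \psi^{\gamma_2}, \tag{38}$$

where $c_1$ and $c_2$ are some constants.
To find $\widetilde\psi$ and the constants $c_1$ and $c_2$ we can use three additional condi-
tions, (35), (36), and (37), which, in view of (38), take the following form:

$$c_1 \gamma_1 + c_2 \gamma_2 = 0, \tag{35'}$$

$$c_1 \widetilde\psi^{\gamma_1} + c_2 \widetilde\psi^{\gamma_2} = \widetilde\psi - a, \tag{36'}$$

$$c_1 \gamma_1 \widetilde\psi^{\gamma_1 - 1} + c_2 \gamma_2 \widetilde\psi^{\gamma_2 - 1} = 1. \tag{37'}$$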
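By (36') and (37'),

$$c_1 = \frac{a\gamma_2 + (1 - \gamma_2)\widetilde\psi}{(\gamma_1 - \gamma_2)\,\widetilde\psi^{\gamma_1}}, \qquad c_2 = \frac{a\gamma_1 + (1 - \gamma_1)\widetilde\psi}{(\gamma_2 - \gamma_1)\,\widetilde\psi^{\gamma_2}}. \tag{39}$$

By (35'),

$$c_1 = -c_2\,\frac{\gamma_2}{\gamma_1}, \tag{40}$$

which yields equation (33) for $\widetilde\psi$. If a = 0, then it follows from this equation that

$$\widetilde\psi = \Bigl|\frac{\gamma_2}{\gamma_1} \cdot \frac{\gamma_1 - 1}{\gamma_2 - 1}\Bigr|^{\frac{1}{\gamma_2 - \gamma_1}}. \tag{41}$$

Finally, from (38) and (39), in view of (33), we see that

$$\widehat U(\psi) = (\widetilde\psi - a)\,\frac{\gamma_2 \psi^{\gamma_1} - \gamma_1 \psi^{\gamma_2}}{\gamma_2 \widetilde\psi^{\gamma_1} - \gamma_1 \widetilde\psi^{\gamma_2}}$$

in the domain $\widetilde C = \{\psi\colon \psi < \widetilde\psi\}$.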
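
The threshold $\widetilde\psi$ and the value function (30) can be computed numerically. The sketch below solves (33), divided by $\psi^{\gamma_1}$, by bisection on $(\max(a, 1), \infty)$, and compares the result for a = 0 with the closed form (41); the bisection approach, the name `russian_option`, and the parameter values are assumptions of this example, not part of the text.

```python
import math


def russian_option(r, sigma, lam, a=0.0):
    """Return (gamma_1, gamma_2, threshold, value function) for (30)-(33)."""
    if lam <= 0.0:
        raise ValueError("the theorem assumes lambda > 0")
    A = 1.0 + 2.0 * r / sigma**2
    B = 2.0 * lam / sigma**2
    root = math.sqrt(0.25 * A * A + B)
    g1, g2 = 0.5 * A - root, 0.5 * A + root      # roots (31): g1 < 0, g2 > 1

    def F(psi):
        # Equation (33) divided by psi**g1; F < 0 at max(a, 1), F > 0 for large psi.
        return psi ** (g2 - g1) * (1.0 - 1.0 / g2 - a / psi) - (1.0 - 1.0 / g1 - a / psi)

    lo = max(a, 1.0)
    hi = 2.0 * lo
    while F(hi) <= 0.0:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if F(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    psi_t = 0.5 * (lo + hi)

    def value(psi):
        if psi >= psi_t:
            return psi - a
        num = g2 * psi**g1 - g1 * psi**g2
        den = g2 * psi_t**g1 - g1 * psi_t**g2
        return (psi_t - a) * num / den           # formula (30)

    return g1, g2, psi_t, value


# Illustrative parameters (assumptions for this sketch only).
g1, g2, psi_t, value = russian_option(0.05, 0.3, 0.05, a=0.0)
psi_closed = (g2 * (g1 - 1.0) / (g1 * (g2 - 1.0))) ** (1.0 / (g2 - g1))  # (41)
print(psi_t, psi_closed, value(1.0))
```

By (17), the price of the Russian option with $S_0 = x$ is then x times `value(1.0)`.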
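We now claim that $\widehat{\mathsf P}_\psi(\widetilde\tau < \infty) = 1$ for $\psi \ge 1$. To this end it suffices to show
that

$$\widehat{\mathsf P}\Bigl(\sup_{t \ge 0} \frac{\sup_{u \le t} S_u}{S_t} \ge \widetilde\psi\Bigr) = 1 \tag{42}$$

for each $\widetilde\psi > 1$. (Property (42) is obvious for $\widetilde\psi = 1$.)
We have

$$\frac{\sup_{u \le t} S_u}{S_t} = \exp Y_t,$$

where

$$Y_t = \sup_{u \le t}\Bigl[\sigma(\widehat W_u - \widehat W_t) + \Bigl(r + \frac{\sigma^2}{2}\Bigr)(u - t)\Bigr].$$

We consider the sequence of stopping times $(\sigma_k)_{k \ge 0}$ such that

$$\sigma_0 = 0, \qquad \sigma_1 = \inf\{t \ge 1\colon Y_t = 0\}, \qquad \sigma_{k+1} = \inf\{t \ge \sigma_k + 1\colon Y_t = 0\}, \ \ldots$$

Then for $\widetilde y = \ln \widetilde\psi > 0$ we have

$$\Bigl\{\omega\colon \sup_{t < \infty} Y_t(\omega) \ge \widetilde y\Bigr\} = \bigcup_{k \ge 0} \Bigl\{\omega\colon \sup_{\sigma_k \le t < \sigma_{k+1}} Y_t(\omega) \ge \widetilde y\Bigr\}.$$

The events $\bigl\{\omega\colon \sup_{\sigma_k \le t < \sigma_{k+1}} Y_t(\omega) \ge \widetilde y\bigr\}$ are independent for distinct k and their
probabilities are equal and positive. Hence $\widehat{\mathsf P}\bigl\{\omega\colon \sup_{t < \infty} Y_t(\omega) \ge \widetilde y\bigr\} = 1$ by the
Borel-Cantelli lemma, so that $\widehat{\mathsf P}_\psi(\widetilde\tau < \infty) = 1$.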


It remains to show that $\widetilde\tau = \inf\{t \ge 0\colon \psi_t \ge \widetilde\psi\}$ is the optimal time in the
problems (15) and (16).

One way to prove this is to test properties (A) and (B) in § 2a, which can be
done in the same way as for call and put options. (See [444] for details.) We now
present another proof, based on 'martingale' considerations (cf. § 2a.5 and [32]).


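The second proof. We assume for simplicity that a = 0. (As regards the general
case of $a \ge 0$, see [32].)
We set

$$M_t = e^{-\lambda t} \psi_t h(\psi_t) \tag{43}$$

and find a function $h = h(\psi)$, $\psi \ge 1$, such that $M = (M_t)_{t \ge 0}$ is a local martingale
with respect to the measure $\widehat{\mathsf P}$.
Using Itô's formula for $e^{-\lambda t} \psi_t h(\psi_t)$ we see that

$$d\bigl(e^{-\lambda t} \psi_t h(\psi_t)\bigr) = e^{-\lambda t} \psi_t \bigl[A_t\,dt + B_t(-\sigma\,d\widehat W_t + d\varphi_t)\bigr], \tag{44}$$

where

$$A_t = -(\lambda + r)h(\psi_t) + (\sigma^2 - r)\psi_t h'(\psi_t) + \frac12 \sigma^2 \psi_t^2 h''(\psi_t), \tag{45}$$

$$B_t = \psi_t h'(\psi_t) + h(\psi_t). \tag{46}$$

By (44), to make $M = (M_t)_{t \ge 0}$ a local martingale we can take $h = h(\psi)$, $\psi \ge 1$,
equal to the solution of the problem

$$\frac12 \sigma^2 \psi^2 h''(\psi) + (\sigma^2 - r)\psi h'(\psi) - (\lambda + r)h(\psi) = 0, \qquad \psi > 1, \tag{47}$$

with boundary condition

$$h'(1+) + h(1+) = 0. \tag{48}$$

We rewrite (47) as

$$\psi^2 h''(\psi) + 2\Bigl(1 - \frac{r}{\sigma^2}\Bigr)\psi h'(\psi) - \frac{2(\lambda + r)}{\sigma^2}\,h(\psi) = 0 \tag{49}$$

and seek its solution in the form $h(\psi) = \psi^{x}$. Then (taking $\sigma^2 = 1$) we obtain the following quadratic
equation for x:

$$x^2 + x(1 - 2r) - 2(\lambda + r) = 0. \tag{50}$$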
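Comparing it with (32):

$$\gamma^2 - \gamma(1 + 2r) - 2\lambda = 0, \tag{51}$$

we see that (51) turns into (50) after the change $\gamma = x + 1$, so that the roots $x_i$
and $\gamma_i$ of these equations are related by the formulas $\gamma_i = x_i + 1$, i = 1, 2.
The general solution of (49) (for $\sigma^2 = 1$) is as follows:

$$h(\psi) = d_1 \psi^{x_1} + d_2 \psi^{x_2}, \qquad \psi \ge 1.$$

We consider $h = h(\psi)$ such that (48) holds. Then

$$d_1 = \frac{1 + x_2}{x_2 - x_1} = \frac{\gamma_2}{\gamma_2 - \gamma_1}, \qquad d_2 = \frac{1 + x_1}{x_1 - x_2} = \frac{\gamma_1}{\gamma_1 - \gamma_2}.$$

Consequently,

$$h(\psi) = \frac{1}{\gamma_2 - \gamma_1}\bigl[\gamma_2 \psi^{\gamma_1 - 1} - \gamma_1 \psi^{\gamma_2 - 1}\bigr]. \tag{52}$$

We have $\gamma_1 < 0$, $\gamma_2 > 1$, and $h'(\psi) = 0$ for $\psi = \widetilde\psi$, where $\widetilde\psi$ can be found from the
equation

$$\widetilde\psi^{\,\gamma_2 - \gamma_1} = \frac{\gamma_2(\gamma_1 - 1)}{\gamma_1(\gamma_2 - 1)}. \tag{53}$$

Comparing (53) and (41) we observe that the quantity $\widetilde\psi$ is the same as $\widetilde\psi$ in (41).
Moreover, the function $h(\psi)$ takes its minimum at the point $\widetilde\psi$.
We see from (43)-(48) that for the function $h = h(\psi)$ so chosen the process

$$M_t = e^{-\lambda t} \psi_t h(\psi_t), \qquad t \ge 0,$$

is a nonnegative local martingale, and therefore a supermartingale. Hence for each
$\tau \in \mathfrak{M}_0^\infty$ and $\psi_0 = 1$ we have

$$\begin{aligned}
\widehat{\mathsf E}_1 e^{-\lambda\tau} \psi_\tau &= \widehat{\mathsf E}_1 h^{-1}(\psi_\tau) M_\tau \le \widehat{\mathsf E}_1 h^{-1}(\widetilde\psi) M_\tau
= h^{-1}(\widetilde\psi)\,\widehat{\mathsf E}_1 M_\tau \le h^{-1}(\widetilde\psi)\,\widehat{\mathsf E}_1 M_0 = h^{-1}(\widetilde\psi) \\
&= \frac{\gamma_2 - \gamma_1}{\gamma_2 \widetilde\psi^{\gamma_1 - 1} - \gamma_1 \widetilde\psi^{\gamma_2 - 1}}
= \frac{\widetilde\psi(\gamma_2 - \gamma_1)}{\gamma_2 \widetilde\psi^{\gamma_1} - \gamma_1 \widetilde\psi^{\gamma_2}}.
\end{aligned}$$

If $\psi_0 = 1$, then $\widetilde\tau = \inf\{t \ge 0\colon \psi_t \ge \widetilde\psi\}$ is finite with probability one as shown
above ($\widehat{\mathsf P}_1(\widetilde\tau < \infty) = 1$), and

$$\widehat{\mathsf E}_1 e^{-\lambda\widetilde\tau} \psi_{\widetilde\tau} = \widehat{\mathsf E}_1 h^{-1}(\psi_{\widetilde\tau}) M_{\widetilde\tau} = h^{-1}(\widetilde\psi) \quad (= \widehat U(1))$$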
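for this stopping time, which means precisely the optimality of $\widetilde\tau$ in the class $\mathfrak{M}_0^\infty$
for $\psi_0 = 1$. (Similar arguments hold for each $\psi_0 < \widetilde\psi$.)
We proceed now to our original problems (6) and (7). By (11), setting a = 0 we
obtain

$$\mathsf E_x e^{-(\lambda + r)\tau} g_\tau(S) I(\tau < \infty) = x\,\widehat{\mathsf E} e^{-\lambda\tau} \psi_\tau I(\tau < \infty). \tag{54}$$

Here $\psi_0 = 1$ and, as shown above, $\widetilde\tau = \inf\{t\colon \psi_t \ge \widetilde\psi\}$ is the optimal stopping time
in the following sense:

$$\sup_{\tau \in \overline{\mathfrak{M}}_0^\infty} \widehat{\mathsf E} e^{-\lambda\tau} \psi_\tau I(\tau < \infty) = \widehat{\mathsf E} e^{-\lambda\widetilde\tau} \psi_{\widetilde\tau} I(\widetilde\tau < \infty) = \widehat{\mathsf E} e^{-\lambda\widetilde\tau} \psi_{\widetilde\tau} \tag{55}$$

and

$$\sup_{\tau \in \mathfrak{M}_0^\infty} \widehat{\mathsf E} e^{-\lambda\tau} \psi_\tau = \widehat{\mathsf E} e^{-\lambda\widetilde\tau} \psi_{\widetilde\tau}. \tag{56}$$

Hence $\widetilde\tau$ is the optimal stopping time in the problem (7).

The arguments used above in the proof of the property $\widehat{\mathsf P}_\psi(\widetilde\tau < \infty) = 1$ are suitable
also for the analysis of the process $(\psi_t)_{t \ge 0}$ under the measure P, where they show that $\widetilde\tau$ is finite almost
everywhere also with respect to P. Hence $\widetilde\tau$ is an optimal stopping time in our
original problems (6) and (7).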


3. American Options in Diffusion (B, S)-Stockmarkets. 
Finite Time Horizons 


§ 3a. Special Features of Calculations on Finite Time Intervals 


1. If the time horizon is infinite, i.e., the exercise times take values in the set [0, ∞), 
then one often manages to describe completely the price structure of American 
options and the corresponding domains of continued and stopped observations. For 
instance, in all the cases considered in § 2 we found both the price V*(x) and the 
boundary point x* in the phase space E = {x: x > 0} between the domain of 
continued observations and the stopping domain. 

We note that this is feasible because a geometric Brownian motion S = (S_t)_{t≥0} 
is a homogeneous Markov process and there are no constraints on exercise times, 
so that the resulting problem has 

elliptic type. 

The situation becomes much more complicated if the time parameter t ranges 
over a bounded interval [0, T]. 

The corresponding optimal stopping problem is inhomogeneous in that case and 
we must consider problems that have 

parabolic type 
from the analytic standpoint. As a result, in place of a boundary point x* we 
encounter in the corresponding problems an interface function x* = x*(t), 0 ≤ t < T, 
separating the domain of continued observations and the stopping domain in the 
phase space [0, T) × E = {(t, x): 0 ≤ t < T, x > 0}. (Cf. Figs. 57 and 59 in 
Chapter VI, §§ 5b, c.) 

We also point out that, although the theory of optimal stopping rules in the 
continuous-time case (see, for instance, the monograph [441]) proposes general 
methods for finding optimal stopping times, we do not know of many concrete 
problems (e.g., concerning options) where one can find precise analytic formulas 
describing the boundary functions x* = x*(t), 0 ≤ t < T, the prices, and so on. 
In practice, e.g., in calculations for American options in some market, one 
usually resorts to quantization (with respect to the time and/or the phase variables) 
and finds approximate values of, say, the boundary function and prices by backward 
induction (see Chapter VI, § 2a). 
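The backward-induction idea can be illustrated by a minimal sketch. It assumes a Cox-Ross-Rubinstein binomial tree as the quantization of a geometric Brownian motion with μ = r and λ = 0, and the put pay-off g(x) = (K − x)⁺; the function name `american_put_crr` and all parameter names are hypothetical and serve only this illustration.

```python
import math


def american_put_crr(s0, strike, r, sigma, horizon, steps):
    """American put price by backward induction on a CRR binomial tree (a sketch)."""
    dt = horizon / steps
    up = math.exp(sigma * math.sqrt(dt))         # up factor per step
    down = 1.0 / up                              # down factor per step
    disc = math.exp(-r * dt)                     # one-step discount factor
    p = (math.exp(r * dt) - down) / (up - down)  # martingale up-probability

    # Terminal layer: the value equals the pay-off g(x) = (K - x)^+.
    values = [max(strike - s0 * up**j * down**(steps - j), 0.0)
              for j in range(steps + 1)]

    # Backward recursion: value = max(pay-off, discounted continuation value).
    for n in range(steps - 1, -1, -1):
        for j in range(n + 1):
            stock = s0 * up**j * down**(n - j)
            continuation = disc * (p * values[j + 1] + (1.0 - p) * values[j])
            values[j] = max(strike - stock, continuation)
    return values[0]


print(american_put_crr(100.0, 100.0, 0.05, 0.2, 1.0, 200))
```

Refining the time step (larger `steps`) gives a sequence of approximate prices; the sketch says nothing about the rate of convergence.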

Of course, this does not eliminate interest in exact (or ‘almost’ exact) solutions: 
in this direction we must first of all discuss several relevant questions of the theory 
of optimal stopping problems on finite time intervals and, in particular, one very 
common method based on the reduction of such problems to Stephan problems 
(often also called free-boundary problems) for partial differential equations. 


2. For definiteness, we consider a (B, S)-market described by relations (1) and (2) 
in § 2a, where we assume that the time parameter t belongs to [0, T], μ = r, and 
the pay-off functions have the form $f_t = e^{-\lambda t} g(S_t)$, where λ ≥ 0, the Borel function 
g(x) is nonnegative, and x ∈ E = (0, ∞). 

Let 

$$V(T, x) = B_0 E_x \frac{f_T}{B_T} \qquad (1)$$
and 
$$V^*(T, x) = \sup_{\tau \in \mathfrak{M}_0^T} B_0 E_x \frac{f_\tau}{B_\tau} \qquad (2)$$

be the rational prices of European and American options, respectively. In 
relations (1) and (2) the symbol $E_x$ means averaging with respect to the original 
measure (which is martingale since μ = r) under the assumption S₀ = x. 


Remark. We proved formula (1) for V(T,x) in Chapter VII, § 4b. The proof of (2) 
is based on the optional decomposition and its idea is the same as for discrete 
time (see Chapter VI, §§ 2c, 5a). See, for instance, [281] for the details specific to 
continuous time. 


3. For t ≥ 0 and x ∈ E = (0, ∞) we set 

$$V(t, x) = E_x e^{-\beta t} g(S_t) \qquad (3)$$
and 
$$V^*(t, x) = \sup_{\tau \in \mathfrak{M}_0^t} E_x e^{-\beta\tau} g(S_\tau), \qquad (4)$$

where β = λ + r and x = S₀. 
In the discussion of the case where t ∈ [0, T] it is also useful to introduce the 
functions 
$$Y(t, x) = V(T - t, x) \qquad (5)$$

and 
$$Y^*(t, x) = V^*(T - t, x), \qquad (6)$$

where T − t is the remaining time. 
Clearly, 
$$Y(t, x) = E_{t,x} e^{-\beta(T - t)} g(S_T) \qquad (7)$$
and 
$$Y^*(t, x) = \sup_{\tau \in \mathfrak{M}_t^T} E_{t,x} e^{-\beta(\tau - t)} g(S_\tau), \qquad (8)$$

where $E_{t,x}$ is averaging with respect to the original (martingale) measure under the 
assumption $S_t = x$, and $\mathfrak{M}_t^T$ is the class of stopping times τ = τ(ω) such 
that t ≤ τ ≤ T. 

We considered the functions V = V(t, x) (in the case of a Brownian motion and 
for β = 0) in Chapter III, § 3f, in connection with the probabilistic representation 
of the solution of a Cauchy problem. The same arguments (see Chapter III, § 3f.5 
for greater detail) show that the function V = V(t, x) (provided that it belongs to 
the class $C^{1,2}$) satisfies for t > 0 and x ∈ E the equation 

$$\frac{\partial V}{\partial t} + \beta V = LV, \qquad (9)$$
where 
$$LV(t, x) = r x \frac{\partial V}{\partial x} + \frac{1}{2}\sigma^2 x^2 \frac{\partial^2 V}{\partial x^2}, \qquad (10)$$
with initial condition 
$$V(0, x) = g(x). \qquad (11)$$


By (5), (9), and (11) we obtain that the function Y = Y(t, x) satisfies for t < T 
the equation 

$$-\frac{\partial Y}{\partial t} + \beta Y = LY \qquad (12)$$
and the boundary condition 
$$Y(T, x) = g(x). \qquad (13)$$


Recall that we have already met the fundamental equation (12) (which is in fact a 
Feynman-Kac equation; see Chapter III, § 3f) in § 1c, in our discussion of the methods 
used by F. Black, M. Scholes [44], and R. Merton [346] to calculate the rational 
price (V(T, x) = Y(0, x)) of a standard European call option with g(x) = (x − K)⁺ 
and λ = 0. 
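For the standard call with λ = 0 the representation V(T, x) = E_x e^{−rT} g(S_T) can be evaluated in two ways, and a short sketch makes the agreement visible. It assumes the geometric Brownian motion model with μ = r and volatility σ; the function names are hypothetical, and the quadrature is a plain midpoint rule chosen for simplicity rather than efficiency.

```python
import math


def norm_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def european_call_closed_form(s0, strike, r, sigma, horizon):
    """Black-Scholes price of a European call with g(x) = (x - K)^+."""
    vol = sigma * math.sqrt(horizon)
    y_plus = (math.log(s0 / strike) + (r + 0.5 * sigma**2) * horizon) / vol
    y_minus = y_plus - vol
    return s0 * norm_cdf(y_plus) - strike * math.exp(-r * horizon) * norm_cdf(y_minus)


def european_value_by_quadrature(payoff, s0, r, sigma, horizon, n=20000, z_max=8.0):
    """E e^{-rT} g(S_T) by a midpoint rule over the standard normal variable z."""
    h = 2.0 * z_max / n
    drift = (r - 0.5 * sigma**2) * horizon
    vol = sigma * math.sqrt(horizon)
    total = 0.0
    for i in range(n):
        z = -z_max + (i + 0.5) * h
        density = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
        total += payoff(s0 * math.exp(drift + vol * z)) * density
    return math.exp(-r * horizon) * total * h


closed = european_call_closed_form(100.0, 100.0, 0.05, 0.2, 1.0)
numeric = european_value_by_quadrature(lambda x: max(x - 100.0, 0.0),
                                       100.0, 0.05, 0.2, 1.0)
print(closed, numeric)
```

The quadrature routine accepts any nonnegative pay-off g, so the same check can be repeated for the put pay-off (K − x)⁺.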


4. We proceed now to the question of the rational price V*(T, x) = Y*(0, x). 
We set 
$$\tau_0^T = \inf\{0 \le t \le T: Y^*(t, S_t) = g(S_t)\} \qquad (14)$$

and 
$$D_t^T = \{x \in E: Y^*(t, x) = g(x)\}, \qquad (15)$$
$$C_t^T = \{x \in E: Y^*(t, x) > g(x)\}. \qquad (16)$$
For s ≤ t we have Y*(s, x) = V*(T − s, x) ≥ V*(T − t, x) = Y*(t, x). Hence for 
0 ≤ s ≤ t < T we obtain 
$$D_0^T \subseteq D_s^T \subseteq D_t^T$$

and 
$$C_0^T \supseteq C_s^T \supseteq C_t^T.$$

For t = T we clearly have $D_T^T = E$ and $C_T^T = \varnothing$. 
The domains 
$$D^T = \{(t, x): t \in [0, T),\ x \in D_t^T\}$$

and 
$$C^T = \{(t, x): t \in [0, T),\ x \in C_t^T\}$$

in the phase space [0, T) × E are called the stopping domain and the domain of 
continued observations, respectively. This is related to the fact that in ‘typical’ 
optimal stopping problems for Markov processes the stopping time $\tau_0^T$ is optimal 
(see, e.g., Theorem 6(3) in [441], Chapter III, § 4): 
$$E_x e^{-\beta\tau_0^T} g(S_{\tau_0^T}) = V^*(T, x). \qquad (17)$$

Since 
$$\tau_0^T = \inf\{0 \le t \le T: S_t \in D_t^T\}, \qquad (18)$$

it is clear why we call $D^T = \bigcup_{t<T} (\{t\} \times D_t^T)$ the stopping domain: if $(t, S_t) \in D^T$, 
then the observations terminate. (It seems reasonable to include in this domain 
also the ‘terminal’ set $\{T\} \times D_T^T = \{T\} \times E$.) 

The domains $D^T$ and $C^T$ can have very complicated structure depending on 
the properties of the functions g = g(x) and, of course, of the process S = (S_t)_{t≤T}. 
For instance, as subsets of [0, T) × E these domains can be multiply connected or 
consist of several stopping ‘islands’, and so on. 

On the other hand, for standard call and put options with g(x) = (x − K)⁺ 
and g(x) = (K − x)⁺, respectively, the domains $D^T$ and $C^T$ are simply connected 
(see § 3c below). 

For these options the boundary $\partial D^T$ of the stopping domain can be represented 
as follows: 

$$\partial D^T = \{(t, x): t \in [0, T),\ x = x^*(t)\},$$

where we have 
$$x^*(t) = \inf\{x \in E: Y^*(t, x) = (x - K)^+\}$$

for a call option and 
$$x^*(t) = \sup\{x \in E: Y^*(t, x) = (K - x)^+\}$$

for a put option. 
§ 3b. Optimal Stopping Problems and Stephan Problems 


1. It follows from our discussion that for a description of the structure of the 
optimal stopping time $\tau_0^T$ and the domains of continued and stopped observations 
one must find the function V* = V*(t, x) or, equivalently, the function Y*(t, x) = 
V*(T − t, x). 

There exist various characterizations of these functions in the general theory of 
optimal stopping rules. 

For instance, it is known (see [441]) that the function Y* = Y*(t, x) is the 
smallest β-excessive majorant of the (nonnegative Borel) function g = g(x). In other 
words, the function Y* = Y*(t, x) is the smallest among all functions F = F(t, x) 
such that 

$$e^{-\beta\Delta} T_\Delta F(t, x) \le F(t, x), \qquad x \in E, \qquad (1)$$

for 0 ≤ t < t + Δ ≤ T, where $T_\Delta F(t, x) = E_{t,x} F(t + \Delta, S_{t+\Delta})$, $x = S_t$, and 
$$g(x) \le F(t, x), \qquad x \in E, \quad 0 \le t \le T. \qquad (2)$$
It follows, in particular, that 
$$\max\{g(x),\ e^{-\beta\Delta} T_\Delta Y^*(t, x)\} \le Y^*(t, x). \qquad (3)$$


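For small Δ > 0 and t = 0, Δ, ..., [T/Δ]Δ one would expect the function 
Y*(t, x) to be ‘close’ to 

$$Y^*_\Delta(t, x) = \sup_{\tau \in \mathfrak{M}_t^T(\Delta)} E_{t,x} e^{-\beta(\tau - t)} g(S_\tau), \qquad (4)$$
where $\mathfrak{M}_t^T(\Delta)$ is the class of stopping times τ such that τ = kΔ, k = 0, 1, ..., [T/Δ], t ≤ τ ≤ T, 
and $\{\omega: \tau \le k\Delta\} \in \mathcal{F}_{k\Delta}(\Delta)$, $\mathcal{F}_{k\Delta}(\Delta) = \sigma\{\omega: S_\Delta, S_{2\Delta}, \dots, S_{k\Delta}\}$. 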

As the theory of optimal stopping rules shows, this conjecture on the ‘closeness’ 
of these functions for small Δ > 0 has a rigorous formulation (see [441; 
Chapter III, § 2]). Further, for $Y^*_\Delta(t, x)$ with t = 0, Δ, ..., [T/Δ]Δ we have the recursive 
relations 

$$Y^*_\Delta(t, x) = \max\bigl\{ g(x),\ e^{-\beta\Delta} E_{t,x} Y^*_\Delta(t + \Delta, S_{t+\Delta}) \bigr\} \qquad (5)$$

(see the discrete-time case in Chapter VI, § 2a and [441], Chapter II, § 4); therefore, 
assuming that Y*(t, x) is sufficiently smooth and using Taylor’s formula we see 
from (5) that 

$$Y^*(t, x) = \max\Bigl\{ g(x),\ (1 - \beta\Delta)\Bigl[ Y^*(t, x) + \Bigl( \frac{\partial Y^*(t, x)}{\partial t} + LY^*(t, x) \Bigr)\Delta \Bigr] + o(\Delta) \Bigr\}, \qquad (6)$$

where 

$$LY^*(t, x) = r x \frac{\partial Y^*(t, x)}{\partial x} + \frac{1}{2}\sigma^2 x^2 \frac{\partial^2 Y^*(t, x)}{\partial x^2}. \qquad (7)$$
By (6) we obtain that in the domain where Y*(t, x) > g(x) (i.e., in the domain 
of continued observations) the function Y* = Y*(t, x) satisfies the equation 

$$-\frac{\partial Y^*}{\partial t} + \beta Y^* = LY^*. \qquad (8)$$


Remark 1. Equation (8) is similar in form to (12) in § 3a; this is not that surprising 
for the following reasons. 
Assume that there exists an optimal stopping time $\tau_t^T$ in the class $\mathfrak{M}_t^T$. Then 

$$Y^*(t, x) = E_{t,x} e^{-\beta(\tau_t^T - t)} g(S_{\tau_t^T}).$$

Since 
$$Y(t, x) = E_{t,x} e^{-\beta(T - t)} g(S_T),$$

it is intuitively clear that if $(t, x) \in C^T$, then the (backward) equations with respect 
to t and x for the functions Y*(t, x) and Y(t, x) must be the same since their 
coefficients are determined by the local characteristics of the same two-dimensional 
process $(u, S_u)_{t \le u \le T}$ in a neighborhood of the initial point $(t, x = S_t)$. 


Remark 2. There exists an extensive literature devoted to the derivation of equa- 
tion (8) for Y*(t, x) in the domain of continued observations; e.g., the monographs 
[266], [287], [441], [478] and the papers [33], [66], [134], [135], [179], [247], [265], 
[272], [340], [363], [467]. 


2. We have already considered the connection between optimal stopping problems and 
Stephan problems in Chapter VI, § 5, in our discussion of American options in a 
binomial (B, S)-market. In the continuous-time case this connection was apparently 
first revealed in statistical sequential analysis, in testing problems 
for statistical hypotheses about the drift of a Wiener process ([67], [300], [349], [440]; 
see also the historico-bibliographical passages in [116] and [441]). 

One of the first papers in the finance literature considering Stephan 
(free-boundary) problems was that of H. McKean [340], devoted to the rational price 
of American warrants. 


3. In mathematical physics, the Stephan problems arise in the study of phase tran- 
sitions ([413], [463]). One simple example of such a two-phase Stephan problem is, 
for instance, as follows. 

Assume that our ‘time-state’ space $\mathbb{R}_+ \times E = \{(t, x): t \ge 0,\ x > 0\}$ is partitioned 
between two phases, 

$$C^{(1)} = \{(t, x): t \ge 0,\ 0 < x < x(t)\}$$

and 
$$C^{(2)} = \{(t, x): t \ge 0,\ x(t) < x < \infty\},$$
where x = x(t), t ≥ 0, is the interface (e.g., between ice and liquid in freezing 
water). For each phase $C^{(i)}$ the temperature u(t, x) at x at time t, i = 1, 2, is 
assumed to satisfy ‘its own’ heat equation 

$$c_i \rho_i \frac{\partial u}{\partial t} = k_i \frac{\partial^2 u}{\partial x^2}, \qquad i = 1, 2, \qquad (9)$$

where (using physical terminology) the $c_i$ are the specific heats, the $\rho_i$ are the 
densities of the phases, and the $k_i$ are the thermal conductivities (see, e.g., [335; 
vol. 5, p. 324]). 
Equations (9) are considered for 
the boundary condition u(0, t) = Const, 

the initial condition u(x, 0) = Const 
and, e.g., for the following 

condition at the interface surface: 


u(t, z(t-)) = eure (10) 
oe la, x(t—)) ag a(t+)), (11) 


for t > 0, with the additional assumption x(0) = 0. 
Under these assumptions the Stephan problem consists in finding the interface 
x = x(t), t ≥ 0, and the function u = u(t, x) describing the temperature schedule 
in the phases. 


4. We have presented an example of a Stephan problem borrowed from mathemat- 
ical physics to emphasize its common and distinct traits in comparison with the 
Stephan problems arising in the search of optimal stopping rules and, in particular, 
in connection with American options. 

As already mentioned, for standard put and call options we also encounter a 
two-phase situation: looking for optimal stopping rules we can restrict ourselves to 
the consideration of only two simply connected phases, the domain CT of contin- 
ued observations, where Y*(t, x) satisfies equation (8), and the domain DT, where 
Y* = Y*(t,x) coincides with the function g = g(x). 

In the next section we formulate precisely the corresponding Stephan problems 
for these two options and describe the qualitative features of the corresponding 
solutions Y* = Y*(t, x) and x* = x*(t). 


§3c. Stephan Problem for Standard Call and Put Options 


1. A call option. We assume that a (B, S)-market can be described by 
relations (1) and (2) in § 2a with μ = r, 0 ≤ t ≤ T, and the pay-off function 
$f_t = e^{-\lambda t} g(S_t)$, where λ ≥ 0 and g(x) = (x − K)⁺, x ∈ E = (0, ∞). The main 
results about this option are as follows. 
1) The rational price V*(T, x), x = S₀, of such a (discounted) option is defined, 
as mentioned in § 3a, by the formula 

$$V^*(T, x) = \sup_{\tau \in \mathfrak{M}_0^T} E_x e^{-\beta\tau} g(S_\tau), \qquad (1)$$

where β = λ + r and $E_x$ is averaging with respect to the original (martingale) 
measure under the assumption S₀ = x. 
2) For t ∈ [0, T] and x ∈ E let 

$$Y^*(t, x) = \sup_{\tau \in \mathfrak{M}_t^T} E_{t,x} e^{-\beta(\tau - t)} g(S_\tau), \qquad (2)$$

where $E_{t,x}$ is averaging with respect to the (martingale) measure under the 
assumption $x = S_t$. 

The function Y* = Y*(t, x) is the smallest β-excessive majorant of the function 
g(x) (see § 3b.1). 

3) The rational price is 

$$V^*(T, x) = Y^*(0, x), \qquad (3)$$
and the rational time for stopping the buyer’s observations and exercising the option 
is 

$$\tau_T^* = \inf\{0 \le t \le T: Y^*(t, S_t) = g(S_t)\} \qquad (4)$$
(we denoted this time by $\tau_0^T$ in § 3a) or, equivalently, 

$$\tau_T^* = \inf\{0 \le t \le T: (t, S_t) \in D^T \cup \{(T, x): x \in E\}\}. \qquad (5)$$


4) The stopping domain $D^T$ and the domain of continued observations $C^T$ are 
simply connected and have the following structure: 

$$D^T = \bigcup_{0 \le t < T} \{(t, x): Y^*(t, x) = g(x)\}, \qquad (6)$$

$$C^T = \bigcup_{0 \le t < T} \{(t, x): Y^*(t, x) > g(x)\}. \qquad (7)$$


5) The function Y* = Y*(t, x) on [0, T) × E belongs to the class $C^{1,2}$. 

For each fixed x ∈ E the function Y*(·, x) is nonincreasing in t; for each fixed 
t ∈ [0, T) the function Y*(t, ·) is nondecreasing and convex (downwards) in x. 

6) The interface function x* = x*(t) is nonincreasing on [0, T), and the sets $C_t^T$ 
and $D_t^T$ have the following form for t < T: 

$$C_t^T = \{x \in E: x < x^*(t)\},$$
$$D_t^T = \{x \in E: x \ge x^*(t)\}.$$
For t = T we have $C_T^T = \varnothing$ and $D_T^T = E$. 
If λ = 0, then x*(t) = ∞ for t < T, which corresponds to the equalities 

$$C_t^T = E, \qquad D_t^T = \varnothing$$

for each t < T. In other words, for t < T observations should go on 
whatever the prices $S_t$ can be, which is a consequence of the fact that the process 
$(e^{-rt}(S_t - K)^+)_{t \ge 0}$ is a submartingale, so that by Doob’s optional stopping 
theorem, 

$$E_x e^{-r\tau}(S_\tau - K)^+ \le E_x e^{-rT}(S_T - K)^+$$

for each $\tau \in \mathfrak{M}_0^T$. 

A similar result is known in the discrete-time case (R. Merton, [346]), and we 
have interpreted it as follows (see Chapter VI, §5b): the standard American and 
European call options ‘coincide’. 

7) The function Y* = Y*(t, x), t ∈ [0, T], x ∈ E, and the interface function 
x* = x*(t), 0 ≤ t < T, are the solutions of the following ‘two-phase’ Stephan (or 
free-boundary) problem: 

$$-\frac{\partial Y^*(t, x)}{\partial t} + \beta Y^*(t, x) = LY^*(t, x) \qquad (8)$$
in the domain $C^T = \{(t, x): x < x^*(t),\ t \in [0, T)\}$; 

$$Y^*(t, x) = g(x) \qquad (9)$$

in the domain $D^T \cup \{(T, x): x \in E\}$; while at the interface surface x* = x*(t), 
0 ≤ t < T, we have the Dirichlet condition (P. G. L. Dirichlet) 

$$Y^*(t, x^*(t)) = g(x^*(t)) \qquad (10)$$
and the Neumann-type condition (K. Neumann) 

$$\frac{\partial Y^*(t, x)}{\partial x}\Big|_{x \uparrow x^*(t)} = \frac{dg(x)}{dx}\Big|_{x \downarrow x^*(t)} \qquad (11)$$

(called above the condition of smooth pasting). 


We shall now make several observations concerning the above results; for detailed 
proofs of these results see the papers mentioned at the end of §3b.1. 

We have already discussed the validity of (1) at the end of § 3a. The fact that $\tau_T^*$ is 
optimal follows from the general theory of optimal stopping rules (see, e.g., [441; 
Chapter III, § 3]). As regards the smoothness of Y*(t, x) and the derivation of (8), 
see, e.g., [247], [363], and [467]. 
While condition (10) is fairly natural, the condition of smooth pasting (11) is not 
that obvious. In [200] and [441; Chapter III, § 8] we have shown that (11) must 
hold at the boundary of the stopping domain under rather general assumptions. 

We also recall that we have already encountered conditions of smooth pasting in 
our considerations of approximations in discrete-time problems (§ 5 in Chapter VI) 
and in our discussions of American options in the case of an infinite time horizon 
(§ 2 in this chapter). 

It is worth noting that while in the standard Stephan problem of mathematical 
physics that we considered in the previous section each phase satisfies an equation 
‘of its own’, in the optimal stopping problems we have a differential equation for 
Y*(t, x) only in one phase (in the domain of continued observations), while in the 
other (the stopping domain) Y*(t, x) must coincide with the fixed function g(x). 

We note also that $\dfrac{dg(x)}{dx}\Big|_{x \downarrow x^*(t)} = 1$ in our case of g(x) = (x − K)⁺ because 
x*(t) > K, 0 ≤ t < T. (It is easy to deduce the last inequality from the fact that 
Y* = Y*(t, x) is a β-excessive majorant of the function g = g(x).) 

As concerns the solubility of the Stephan problem (8)-(11) and the properties 
of the interface function x* = x*(t) see [467] and [363] (see also the comments to 
the second paper). 


2. Put options. In that case g(x) = (K − x)⁺. Properties 1)-4) remain valid 
and the function Y* = Y*(t, x) is in the class $C^{1,2}$ again. For each fixed x ∈ E 
the function Y*(·, x) is nonincreasing in t; for each fixed t ∈ [0, T) the function 
Y*(t, ·) is nonincreasing and (downwards) convex in x. 

For each λ ≥ 0 the sets $C_t^T$ and $D_t^T$ have the following form for t < T: 

$$C_t^T = \{x: x > x^*(t)\},$$
$$D_t^T = \{x: x \le x^*(t)\}.$$
For t = T we have $C_T^T = \varnothing$ and $D_T^T = E$. 
The interface function x* = x*(t) is nondecreasing in t; if λ = 0, then 
$$\lim_{t \uparrow T} x^*(t) = K.$$
The Stephan problem for Y*(t, x) and x*(t) can be formulated in a similar 
manner. Here conditions (8), (9), and (10) are the same, while (11) takes the 
following form for 0 ≤ t < T: 

$$\frac{\partial Y^*(t, x)}{\partial x}\Big|_{x \downarrow x^*(t)} = \frac{dg(x)}{dx}\Big|_{x \uparrow x^*(t)},$$
where 
$$\frac{dg(x)}{dx}\Big|_{x \uparrow x^*(t)} = -1$$

because g(x) = (K − x)⁺ and x*(t) < K. 
One can find additional information on the properties of the functions Y*(t, x) 
and x*(t) in the paper [363] devoted to standard American put options and 
containing an extensive bibliography on other types of options. 


§3d. Relations between the Prices 
of European and American Options 


1. We have previously mentioned that American options are in fact more commonly 
traded than European ones. However, while for the latter we have such remarkable 
results as, e.g., the Black-Scholes formulas, calculations for American options in 
problems with finite time horizon encounter great analytic difficulties. Ultimately, 
this is due to difficulties with the solution of the corresponding Stephan 
problems. 

It is clear from (1) and (2) in § 3a that V*(T, x) ≥ V(T, x); this is, of course, 
fairly natural since, by definition, American options allow the buyer not merely 
to wait for the exercise date but to choose the time of exercise. 

In this section we present several results on the relations between the prices 
of standard call and put options with pay-off functions g(x) = (x — K)* and 
g(x) = (K — x)*, respectively. 

We shall assume that λ = 0. Then formulae (1) and (2) in § 3a can be written 
as follows: 

$$V(T, x) = E_x e^{-rT} g(S_T) \qquad (1)$$
and 
$$V^*(T, x) = \sup_{\tau \in \mathfrak{M}_0^T} E_x e^{-r\tau} g(S_\tau), \qquad (2)$$

where x = S₀. 
2. The question of the relation between the prices V(T, x) and V*(T, x) for 
g(x) = (x − K)⁺ is very easy to answer. In that case 

$$V(T, x) = V^*(T, x), \qquad (3)$$

and $\tau_T^* = T$ is an optimal stopping time in the class $\mathfrak{M}_0^T$ (see § 3c). 
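Equality (3) has a discrete-time counterpart that is easy to observe numerically. The sketch below assumes a Cox-Ross-Rubinstein binomial approximation with r ≥ 0 and λ = 0; `binomial_price` and its parameters are hypothetical names. On such a tree the possibility of early exercise adds nothing to the call price, while it does add to the put price.

```python
import math


def binomial_price(payoff, s0, r, sigma, horizon, steps, american):
    """European or American price of a pay-off g(S) on a CRR tree (a sketch)."""
    dt = horizon / steps
    up = math.exp(sigma * math.sqrt(dt))
    down = 1.0 / up
    disc = math.exp(-r * dt)
    p = (math.exp(r * dt) - down) / (up - down)  # martingale up-probability

    values = [payoff(s0 * up**j * down**(steps - j)) for j in range(steps + 1)]
    for n in range(steps - 1, -1, -1):
        for j in range(n + 1):
            value = disc * (p * values[j + 1] + (1.0 - p) * values[j])
            if american:
                # The American buyer may also stop and take the pay-off now.
                value = max(value, payoff(s0 * up**j * down**(n - j)))
            values[j] = value
    return values[0]


def call(x):
    return max(x - 100.0, 0.0)


def put(x):
    return max(100.0 - x, 0.0)


args = (100.0, 0.05, 0.2, 1.0, 200)
print(binomial_price(call, *args, american=True) - binomial_price(call, *args, american=False))
print(binomial_price(put, *args, american=True) - binomial_price(put, *args, american=False))
```

The first printed difference is zero up to rounding, the second is strictly positive.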


Remark. We emphasize that if λ > 0, then (3) does not hold any longer because 
the process $(e^{-(\lambda + r)t}(S_t - K)^+)_{t \ge 0}$ is no longer a submartingale (cf. § 3c). 
We proceed now to the question of the size of the ‘deficiency’ 

$$\Delta_0^T(x) = V^*(T, x) - V(T, x) \qquad (4)$$
for a standard put option (g(x) = (K − x)⁺). We set λ = 0 and denote by 
x* = x*(t), 0 ≤ t < T, the function defining the interface between the stopping 
domain and the domain of continued observations for the optimal stopping time $\tau_T^*$. 
THEOREM. For a standard put option, 

$$\Delta_0^T(x) = rK\,E_x \int_0^T e^{-ru} I(S_u < x^*(u))\,du. \qquad (5)$$


COROLLARY 1. Let $P_T$ and $P_T^*$ be the rational prices of European and American 
put options ($P_T = V(T, S_0)$, $P_T^* = V^*(T, S_0)$). Then 

$$P_T^* = P_T + rK\,E_{S_0} \int_0^T e^{-ru} I(S_u < x^*(u))\,du$$

$$= K e^{-rT}\Phi(-y_-) - S_0 \Phi(-y_+) + rK \int_0^T e^{-ru}\,\Phi\bigl(-y_-(u, x^*(u))\bigr)\,du, \qquad (6)$$

where (cf. the notation in § 1b) 

$$y_\pm = \frac{\ln\dfrac{S_0}{K} + T\Bigl(r \pm \dfrac{\sigma^2}{2}\Bigr)}{\sigma\sqrt{T}}$$
and 
$$y_-(u, x^*(u)) = \frac{\ln\dfrac{S_0}{x^*(u)} + u\Bigl(r - \dfrac{\sigma^2}{2}\Bigr)}{\sigma\sqrt{u}}. \qquad (7)$$
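Proof. Let 
$$Y(t, x) = E_{t,x} e^{-r(T - t)} g(S_T) \qquad (8)$$
and let 
$$Y^*(t, x) = \sup_{\tau \in \mathfrak{M}_t^T} E_{t,x} e^{-r(\tau - t)} g(S_\tau), \qquad (9)$$
where g(x) = (K − x)⁺. Then for 
$$\Delta_t^T(x) = Y^*(t, x) - Y(t, x) \qquad (10)$$
we have 
$$e^{-rt}\Delta_t^T(x) = E_{t,x}\bigl\{e^{-r\tau_t^T} g(S_{\tau_t^T}) - e^{-rT} g(S_T)\bigr\}, \qquad (11)$$

where $\tau_t^T$ is the optimal stopping time in the problem (9). 
By Itô’s formula (Chapter II, § 5c) 

$$d(e^{-ru}(K - S_u)^+) = e^{-ru}\,d(K - S_u)^+ - r e^{-ru}(K - S_u)^+\,du, \qquad (12)$$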
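and by the Itô-Meyer formula for convex functions (see Chapter VII, § 4a; [395, 
Chapter IV]; cf. also Tanaka’s formula (17) in Chapter III, § 5c) 

$$d(K - S_u)^+ = -I(S_u < K)\,dS_u + \frac{1}{2}\,dL_u(K), \qquad (13)$$

where 

$$L_u(K) = \lim_{\varepsilon \downarrow 0} \frac{1}{2\varepsilon} \int_0^u I(|S_t - K| \le \varepsilon)\,dt \qquad (14)$$

is the local time of the process S = (S_t)_{t≥0} (on [0, u]) at the level K. 
By (11)-(13) we obtain 

$$e^{-rt}\Delta_t^T(x) = -E_{t,x} \int_{\tau_t^T}^T d(e^{-ru}(K - S_u)^+)$$

$$= -E_{t,x} \int_{\tau_t^T}^T e^{-ru}\Bigl\{-I(S_u < K)\,dS_u + \frac{1}{2}\,dL_u(K) - r(K - S_u) I(S_u < K)\,du\Bigr\}$$

$$= E_{t,x} \int_{\tau_t^T}^T e^{-ru}\Bigl\{-\frac{1}{2}\,dL_u(K) + I(S_u < K)\bigl[rS_u\,du + \sigma S_u\,dW_u + (rK - rS_u)\,du\bigr]\Bigr\}$$

$$= E_{t,x} \int_{\tau_t^T}^T e^{-ru}\Bigl\{rK\,I(S_u < K)\,du - \frac{1}{2}\,dL_u(K)\Bigr\}. \qquad (15)$$

For t ≤ T we set 

$$A_t = \int_0^t e^{-ru}\Bigl\{rK\,I(S_u < K)\,du - \frac{1}{2}\,dL_u(K)\Bigr\}.$$

Then, since $\tau_t^T \le T$, it follows from (15) that 
$$e^{-rt}\Delta_t^T(x) = E_{t,x}\bigl[A_T - A_{\tau_t^T}\bigr]. \qquad (16)$$
We represent now $A_t$ as follows: 
$$A_t = A_t^1 + A_t^2, \qquad (17)$$
where 
$$A_t^1 = \int_0^t e^{-ru} I(S_u < x^*(u))\Bigl\{rK\,I(S_u < K)\,du - \frac{1}{2}\,dL_u(K)\Bigr\},$$

$$A_t^2 = \int_0^t e^{-ru} I(S_u \ge x^*(u))\Bigl\{rK\,I(S_u < K)\,du - \frac{1}{2}\,dL_u(K)\Bigr\}.$$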
Since x*(u) < K for all u < T, it follows that 

$$A_t^1 = \int_0^t e^{-ru} I(S_u < x^*(u))\,rK\,du = rK \int_0^t e^{-ru} I(S_u < x^*(u))\,du. \qquad (18)$$


The process $A^1 = (A_t^1)_{t \le T}$ is a predictable submartingale and, by Corollary 2 
in Chapter III, § 5b, it is the compensator of itself. A slightly more refined analysis 
(see [134], [135], and [363]) shows that the compensator of the process $A^2 = (A_t^2)_{t \le T}$ 
vanishes. Hence 

$$e^{-rt}\Delta_t^T(x) = E_{t,x}\bigl[A_T - A_{\tau_t^T}\bigr] = E_{t,x}\bigl[A_T^1 - A_{\tau_t^T}^1\bigr] = rK\,E_{t,x} \int_t^T e^{-ru} I(S_u < x^*(u))\,du, \qquad (19)$$

so that (5) holds for $\Delta_0^T(x)$. 
Finally, formula (6) in Corollary 1 is a consequence of (5) and relation (18) 
in § 1b for $P_T$. This completes the proof of the theorem and Corollary 1. 


COROLLARY 2. The functions Y*(t, x) and x*(t) are connected by the relation 
$$Y^*(t, x^*(t)) = K - x^*(t), \qquad t \le T, \qquad (20)$$

which can be regarded as an integral relation for the interface function x* = x*(t), 
t ≤ T, with $x^*(T) = \lim_{t \uparrow T} x^*(t)$. 


It should be pointed out that the function Y*(t, x) is in fact also unknown. In 
place of this function one considers in practice approximations $Y^*_\Delta(t, x)$ calculated 
by means of backward induction (see § 3b.1). Replacing Y*(t, x) by $Y^*_\Delta(t, x)$ in (19) 
we obtain a function $x^*_\Delta = x^*_\Delta(t)$, t ≤ T, which is taken for an approximation to 
x* = x*(t), t ≤ T. 
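One simple way to obtain such an approximate interface can be sketched as follows. This is not the procedure based on (19) itself but a cruder variant, stated here as an assumption: the backward induction is run on a Cox-Ross-Rubinstein tree for the put pay-off with λ = 0, and at each time layer the highest lattice price at which stopping is at least as good as continuing is recorded. The name `put_exercise_boundary` and its parameters are hypothetical.

```python
import math


def put_exercise_boundary(s0, strike, r, sigma, horizon, steps):
    """Approximate interface of an American put on a CRR tree (a sketch).

    Returns a list of (time, level) pairs; level is None when no node of the
    lattice at that time lies in the stopping domain.
    """
    dt = horizon / steps
    up = math.exp(sigma * math.sqrt(dt))
    disc = math.exp(-r * dt)
    p = (math.exp(r * dt) - 1.0 / up) / (up - 1.0 / up)

    # Node j at layer n carries the price s0 * up**(2*j - n).
    values = [max(strike - s0 * up**(2 * j - steps), 0.0) for j in range(steps + 1)]
    boundary = []
    for n in range(steps - 1, -1, -1):
        highest_stop = None
        for j in range(n + 1):  # prices increase with j
            stock = s0 * up**(2 * j - n)
            payoff = strike - stock
            continuation = disc * (p * values[j + 1] + (1.0 - p) * values[j])
            if payoff > 0.0 and payoff >= continuation:
                highest_stop = stock      # stopping is optimal at this node
                values[j] = payoff
            else:
                values[j] = continuation
        boundary.append((n * dt, highest_stop))
    boundary.reverse()
    return boundary


levels = [(t, x) for t, x in put_exercise_boundary(100.0, 100.0, 0.05, 0.2, 1.0, 200)
          if x is not None]
print(levels[0], levels[-1])
```

The recorded levels lie below K and approach K near the terminal time, in line with the qualitative properties of x*(t) for the put option; the lattice makes the approximate interface piecewise constant and somewhat jagged.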


4. European and American Options 
in a Diffusion (B, P)-Bondmarket 


§ 4a. Option Pricing in a Bondmarket 


1. So far we have considered only options in (B, S)-stockmarkets, whereas one 
encounters a great variety of them in practice: options on eurodollars, futures, 
currency, etc., even options on options. Besides standard put and call options their 
various combinations are also traded. Many financial instruments in the option 
family have an elaborate structure as regards the corresponding pay-off functions 
and the securities underlying option contracts. 

The diversity of options, some of which are said to be ‘exotic’, can be judged 
by their names: up-and-out put, up-and-in put, down-and-out call, down-and-in 
call, barrier option, Bermuda option, Rainbow option, Russian option, knock-out 
option, digital option, all-or-nothing, one-touch all-or-nothing, supershares, ... 
(see [232], [414], or [415]). 

In our discussions of pricing for these options and other derivative instruments 
we should point out that the general methods remain the same as in the models 
considered by F. Black, M. Scholes, and R. Merton ([44], [346]) in the case of a 
(B, S)-stockmarket. Again, two approaches are possible: the martingale approach 
and the approach based on direct considerations of the ‘fundamental equation’ 
(cf. §§ 1b, c). 


2. The discussion that follows relates to calculations for standard European and 
American options in the case where in place of a (B, S)-market we consider a 
(B, P)-market consisting of a bank account $B = (B_t)_{t \le T}$ and one bond of 
maturity T with price described by a (positive) process $P = (P(t, T))_{t \le T}$ satisfying the 
condition P(T, T) = 1. 

In accordance with Chapter III, § 4a and Chapter VII, § 5a, we shall take for 
our description of the (B, P)-market the indirect approach, where we assume that 
the state of the bank account $(B_t)_{t \le T}$ can be expressed by the formula 

$$B_t = B_0 \exp\Bigl(\int_0^t r(s)\,ds\Bigr) \qquad (1)$$

with some stochastic process of interest rate $r = (r(t))_{t \le T}$. 
As regards the dynamics of the bond price process $P = (P(t, T))_{t \le T}$ we assume 
that the discounted prices 

$$\frac{P(t, T)}{B_t}, \qquad t \le T, \qquad (2)$$

make up a martingale with respect to the initial measure on $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t \le T})$. 
By Theorem 1 in Chapter VII, § 5a we obtain 

$$P(t, T) = E\Bigl(\exp\Bigl(-\int_t^T r(s)\,ds\Bigr) \Bigm| \mathcal{F}_t\Bigr), \qquad (3)$$

and by Theorem 2 in the same section the (B, P)-market in question is arbitrage-free 
(e.g., in the NA₊-version of this concept). 


3. From (1) and (3) we see that the dynamics of the processes $(B_t)_{t \le T}$ and 
$(P(t, T))_{t \le T}$ in our (B, P)-market depends considerably on the structure of the 
process $r = (r(t))_{t \le T}$. 

Our main assumption about this process is that it is a diffusion Gauss-Markov 
process described by the stochastic differential equation 

$$dr(t) = (\alpha(t) - \beta(t) r(t))\,dt + \gamma(t)\,dW_t, \qquad (4)$$

with Wiener process $(W_t)_{t \le T}$ and the (nonrandom) initial condition r(0) = r₀. We 
assume that the functions α(t), β(t), and γ(t) are deterministic and 

$$\int_0^T \bigl(|\alpha(t)| + |\beta(t)| + \gamma^2(t)\bigr)\,dt < \infty. \qquad (5)$$
Under these assumptions equation (4) has a unique (strong) solution 
$$r(t) = g(t)\Bigl[r_0 + \int_0^t \frac{\alpha(s)}{g(s)}\,ds + \int_0^t \frac{\gamma(s)}{g(s)}\,dW_s\Bigr], \qquad (6)$$
where 
$$g(t) = \exp\Bigl(-\int_0^t \beta(s)\,ds\Bigr) \qquad (7)$$

is the fundamental solution of the equation 

$$g(t) = 1 - \int_0^t \beta(s) g(s)\,ds. \qquad (8)$$
Remark 1. Recalling our discussion in Chapter III, § 4a we see that (4) is just the 
Hull-White model, of which the Merton, the Vasiček, and the Ho-Lee models are 
particular cases (see formulas (14), (7), (8), and (12) in Chapter III, § 4a). 
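For constant coefficients α(t) = α, β(t) = β > 0, γ(t) = γ (the Vasiček case) representation (6) shows that, given r(t), the value r(t + h) is Gaussian with mean r(t)e^{−βh} + (α/β)(1 − e^{−βh}) and variance γ²(1 − e^{−2βh})/(2β). A minimal simulation sketch based on this transition law follows; the function names and the numerical parameter values are hypothetical and chosen only for illustration.

```python
import math
import random


def vasicek_step(r, alpha, beta, gamma, h, rng):
    """Sample r(t + h) given r(t) = r from the exact Gaussian transition law."""
    decay = math.exp(-beta * h)
    mean = r * decay + (alpha / beta) * (1.0 - decay)
    std = gamma * math.sqrt((1.0 - decay * decay) / (2.0 * beta))
    return mean + std * rng.gauss(0.0, 1.0)


def simulate_vasicek(r0, alpha, beta, gamma, horizon, steps, rng):
    """Return the path r(0), r(h), ..., r(horizon) on an equidistant grid."""
    h = horizon / steps
    path = [r0]
    for _ in range(steps):
        path.append(vasicek_step(path[-1], alpha, beta, gamma, h, rng))
    return path


rng = random.Random(0)
path = simulate_vasicek(0.03, 0.02, 0.5, 0.01, 5.0, 5, rng)
print(path)
```

Because the transition law is exact, the grid step h introduces no discretization bias in the sampled values of r at the grid points.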


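4. Since $r = (r(t))_{t \le T}$ is a Markov process, it follows that 

$$P(t, T) = E\Bigl[\exp\Bigl(-\int_t^T r(s)\,ds\Bigr) \Bigm| r(t)\Bigr]. \qquad (9)$$

We set $I(t, T) = \int_t^T r(s)\,ds$. Then we see easily from (6) that 

$$E(I(t, T) \mid r(t)) = r(t) \int_t^T \frac{g(u)}{g(t)}\,du + \int_t^T \int_t^u \frac{g(u)}{g(s)}\,\alpha(s)\,ds\,du, \qquad (10)$$

$$D(I(t, T) \mid r(t)) = \int_t^T \Bigl[\int_s^T \frac{g(u)}{g(s)}\,du\Bigr]^2 \gamma^2(s)\,ds. \qquad (11)$$

Hence it is easy to obtain by (3) the following representation for 

$$P(t, T) = E[\exp(-I(t, T)) \mid r(t)] = \exp\Bigl(\frac{1}{2} D(I(t, T) \mid r(t)) - E(I(t, T) \mid r(t))\Bigr):$$

$$P(t, T) = \exp(A(t, T) - r(t) B(t, T)), \qquad (12)$$
where 
$$A(t, T) = \frac{1}{2} \int_t^T \Bigl[\int_s^T \frac{g(u)}{g(s)}\,du\Bigr]^2 \gamma^2(s)\,ds - \int_t^T \Bigl[\int_t^u \frac{g(u)}{g(s)}\,\alpha(s)\,ds\Bigr] du \qquad (13)$$
and 
$$B(t, T) = \int_t^T \frac{g(u)}{g(t)}\,du. \qquad (14)$$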

Remark 2. Using the terminology of Chapter III, §4c we call the models with 
prices P(t, T) representable as in (12) single-factor affine models. Our additional 
assumption that r = (r(t))¢<7 is a Gauss-Markov process enables one to carry out 
fairly complete calculations for European and American options in (B, P)-markets 
for such models (often called also single-factor Gaussian models). We devote § § 4b, c 
below to these problems. 
Remark 3. As regards the agreement between various models describing the 
dynamics of stock prices and empirical data, see, for instance, [257]. 


§4b. European Option Pricing in Single-Factor Gaussian Models 


1. We shall consider a model of a (B, P)-market consisting of a bank account and a 
bond and completely determined by a single factor, the interest rate $r = (r(t))_{t \le T}$ 
that is a Gauss-Markov process satisfying stochastic differential equation (4) in § 4a 
with (nonrandom) initial condition r(0) = r₀. 

Let T⁰ be some time (T⁰ < T) treated as the time of exercising a European 
option with pay-off function $f_{T^0} = (P(T^0, T) - K)^+$ for a call option and $f_{T^0} = 
(K - P(T^0, T))^+$ for a put option. 


THEOREM. The rational price $C^0(T^0, T)$ of a standard call option in the 
single-factor Gaussian model of a (B, P)-market in question is described by the formula 

$$C^0(T^0, T) = P(0, T)\Phi(d_+) - K P(0, T^0)\Phi(d_-), \qquad (1)$$
where 
$$d_\pm = \frac{\ln\dfrac{P(0, T)}{K P(0, T^0)} \pm \dfrac{1}{2}\sigma^2(T^0, T) B^2(T^0, T)}{\sigma(T^0, T) B(T^0, T)}, \qquad (2)$$
$$B(T^0, T) = \int_{T^0}^T \frac{g(u)}{g(T^0)}\,du, \qquad (3)$$

$$\sigma^2(T^0, T) = \int_0^{T^0} \Bigl(\frac{g(T^0)}{g(s)}\Bigr)^2 \gamma^2(s)\,ds, \qquad (4)$$
$$g(s) = \exp\Bigl(-\int_0^s \beta(u)\,du\Bigr). \qquad (5)$$


The rational price $P^0(T^0, T)$ of a standard put option is described by the formula 

$$P^0(T^0, T) = K P(0, T^0)\Phi(-d_-) - P(0, T)\Phi(-d_+). \qquad (6)$$


Before proceeding to the proof of (1) and (6), we point out that these relations 
are very similar to the formulas for the rational prices C(T) and P(T) in the case of 
stock (see (9) and (18) in § 1b). 
This is not that surprising given that, similarly to the prices $S_t$ in the 
Black-Merton-Scholes model, the prices P(t, T) in the present model have a logarithmically 
normal structure: 

$$\ln P(t, T) = A(t, T) - r(t) B(t, T),$$

where $(r(t))_{t \le T}$ is a Gaussian process, 

$$\ln\frac{S_t}{S_0} = \Bigl(r - \frac{\sigma^2}{2}\Bigr)t + \sigma W_t,$$

and $(W_t)_{t \le T}$ is a Wiener (and therefore also Gaussian) process. 

It is perhaps more surprising that so much time passed between the publication of 
the Black-Scholes formula in 1973 and 1989, when F. Jamshidian published the 
paper [256] containing formulas (1) and (6) for the Vasiček model (α(t) = α, β(t) = β, 
γ(t) = γ; see (4) in § 4a and (8) in Chapter III, § 4a). We follow mainly [257] in our 
proof below. 


2. By pricing theory for complete arbitrage-free markets (see § 5 in Chapter VII), 
assuming that the initial probability measure on $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t \le T})$, $\mathcal{F}_T = \mathcal{F}$, is a 
martingale measure and setting 

$$R(t) = \exp\Bigl(-\int_0^t r(u)\,du\Bigr), \qquad (7)$$

we obtain 
$$C^0(T^0, T) = E R(T^0)(P(T^0, T) - K)^+$$
$$= E\bigl(I(P(T^0, T) > K) R(T^0)(P(T^0, T) - K)\bigr)$$
$$= E\bigl(I(P(T^0, T) > K) R(T^0) P(T^0, T)\bigr) - K E\bigl(I(P(T^0, T) > K) R(T^0)\bigr). \qquad (8)$$

Clearly, we have the coincidence of the events 
$$\{P(T^0, T) > K\} = \{A(T^0, T) - r(T^0) B(T^0, T) > \ln K\} = \{r(T^0) < r^*\}, \qquad (9)$$
where 

$$r^* = \frac{\ln K - A(T^0, T)}{-B(T^0, T)} \qquad (10)$$

and A(t, T) and B(t, T) are defined by (13) and (14) in § 4a. 
Let 
$$\xi = r(T^0), \qquad \eta = \int_0^T r(u)\,du, \qquad \zeta = \int_0^{T^0} r(u)\,du.$$
Then by (8) and (9) we obtain 

$$C^0(T^0, T) = E(I(\xi < r^*) e^{-\eta}) - K E(I(\xi < r^*) e^{-\zeta}). \qquad (11)$$


For a further simplification of this formula we rely on the following result, which 
can be verified by a direct calculation (see [257; Lemma 4.2]). 


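Lemma. Let $(X,Y)$ be a Gaussian pair of random variables with vector of mean
values $(\mu_X, \mu_Y)$ and covariance matrix
$\begin{pmatrix} \sigma_X^2 & \rho_{XY} \\ \rho_{XY} & \sigma_Y^2 \end{pmatrix}$. Then

$$E\,I(X < x)\exp(-Y) = \exp\Big(\tfrac12\sigma_Y^2 - \mu_Y\Big)\,\Phi(\tilde x) \tag{12}$$

and

$$E\,I(X < x)\,X\exp(-Y) = \exp\Big(\tfrac12\sigma_Y^2 - \mu_Y\Big)\big\{(\mu_X - \rho_{XY})\,\Phi(\tilde x) - \sigma_X\,\varphi(\tilde x)\big\}, \tag{13}$$

where

$$\tilde x = \frac{x - (\mu_X - \rho_{XY})}{\sigma_X}, \qquad
\varphi(x) = \frac{1}{\sqrt{2\pi}}\,e^{-x^2/2}, \qquad
\Phi(x) = \int_{-\infty}^{x} \varphi(y)\,dy.$$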
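
The lemma can be checked numerically. The sketch below is only an illustration, not part of the proof in [257]: it compares the right-hand sides of (12) and (13) with a midpoint-rule integration over the values of $X$, using the conditional law of $Y$ given $X$ for a Gaussian pair. The function names and the parameter values are hypothetical and chosen for the example; `cov_xy` plays the role of $\rho_{XY}$.

```python
import math

def norm_pdf(z):
    """Standard normal density phi(z)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    """Standard normal distribution function Phi(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def lemma_closed_form(x, mu_x, mu_y, var_x, var_y, cov_xy):
    """Right-hand sides of (12) and (13)."""
    sd_x = math.sqrt(var_x)
    shifted_mean = mu_x - cov_xy
    x_tilde = (x - shifted_mean) / sd_x
    scale = math.exp(0.5 * var_y - mu_y)
    rhs12 = scale * norm_cdf(x_tilde)
    rhs13 = scale * (shifted_mean * norm_cdf(x_tilde) - sd_x * norm_pdf(x_tilde))
    return rhs12, rhs13

def lemma_by_quadrature(x, mu_x, mu_y, var_x, var_y, cov_xy, n=20000):
    """Left-hand sides of (12) and (13) by the midpoint rule in the X variable.

    Uses E[exp(-Y) | X = u] = exp(-m(u) + v / 2), where m(u) and v are the
    conditional mean and variance of Y given X = u.
    """
    sd_x = math.sqrt(var_x)
    lower = mu_x - 12.0 * sd_x  # truncation point; the mass below it is negligible here
    if x <= lower:
        return 0.0, 0.0
    h = (x - lower) / n
    cond_var = var_y - cov_xy ** 2 / var_x
    lhs12 = 0.0
    lhs13 = 0.0
    for i in range(n):
        u = lower + (i + 0.5) * h
        cond_mean = mu_y + cov_xy / var_x * (u - mu_x)
        weight = math.exp(-cond_mean + 0.5 * cond_var) * norm_pdf((u - mu_x) / sd_x) / sd_x
        lhs12 += weight * h
        lhs13 += u * weight * h
    return lhs12, lhs13

# Illustrative (hypothetical) parameter values.
params = dict(x=0.8, mu_x=0.5, mu_y=0.3, var_x=1.0, var_y=0.64, cov_xy=0.4)
print(lemma_closed_form(**params))
print(lemma_by_quadrature(**params))
```

The two printed pairs should agree up to the quadrature error; the truncation at twelve standard deviations is adequate only when the covariance is moderate relative to $\sigma_X$, as in this example.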


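Taking formulas (6), (10), and (11) in § 4a into account it is easy to calculate
that

$$\mu_\eta = E\int_0^{T} r(u)\,du = r_0\int_0^{T} g(u)\,du + \int_0^{T}\!\!\int_0^{u} \frac{g(u)}{g(s)}\,\alpha(s)\,ds\,du,$$

$$\mu_\zeta = E\int_0^{T^0} r(u)\,du = r_0\int_0^{T^0} g(u)\,du + \int_0^{T^0}\!\!\int_0^{u} \frac{g(u)}{g(s)}\,\alpha(s)\,ds\,du$$

and

$$\sigma_\xi^2 = D\,r(T^0) = \int_0^{T^0} \Big(\frac{g(T^0)}{g(s)}\Big)^2 \gamma^2(s)\,ds,$$

$$\sigma_\eta^2 = D\int_0^{T} r(u)\,du = \int_0^{T} \Big[\int_s^{T} \frac{g(u)}{g(s)}\,du\Big]^2 \gamma^2(s)\,ds,$$

$$\sigma_\zeta^2 = D\int_0^{T^0} r(u)\,du = \int_0^{T^0} \Big[\int_s^{T^0} \frac{g(u)}{g(s)}\,du\Big]^2 \gamma^2(s)\,ds,$$

$$\rho_{\xi\zeta} = \mathrm{Cov}\Big(r(T^0), \int_0^{T^0} r(u)\,du\Big)
= \int_0^{T^0} \frac{g(T^0)}{g(s)}\Big[\int_s^{T^0} \frac{g(u)}{g(s)}\,du\Big]\gamma^2(s)\,ds,$$

$$\begin{aligned}
\rho_{\xi\eta} &= \mathrm{Cov}\Big(r(T^0), \int_0^{T} r(u)\,du\Big) \\
&= \mathrm{Cov}\Big(r(T^0), \int_0^{T^0} r(u)\,du\Big) + \mathrm{Cov}\Big(r(T^0), \int_{T^0}^{T} r(u)\,du\Big) \\
&= \rho_{\xi\zeta} + \sigma_\xi^2 \int_{T^0}^{T} \frac{g(u)}{g(T^0)}\,du.
\end{aligned}$$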

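By (11) and (12) we obtain

$$\begin{aligned}
C^0(T^0,T) &= E\big(I(\xi < r^*)\,e^{-\eta}\big) - K\,E\big(I(\xi < r^*)\,e^{-\zeta}\big) \\
&= \exp\Big(\tfrac12\sigma_\eta^2 - \mu_\eta\Big)\,\Phi\Big(\frac{r^* - (\mu_\xi - \rho_{\xi\eta})}{\sigma_\xi}\Big)
 - K\exp\Big(\tfrac12\sigma_\zeta^2 - \mu_\zeta\Big)\,\Phi\Big(\frac{r^* - (\mu_\xi - \rho_{\xi\zeta})}{\sigma_\xi}\Big).
\end{aligned} \tag{14}$$

Substituting here the above values of $\mu_\xi$, $\mu_\eta$, $\mu_\zeta$, $\sigma_\xi$,
$\sigma_\eta$, $\sigma_\zeta$, $\rho_{\xi\eta}$, and $\rho_{\xi\zeta}$,
after some algebraic transformations (see [257; Appendix 4b]) we obtain the required
formula (1).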
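
To see (14) at work, one can evaluate it in the constant-coefficient (Vasiček) case. The following is a minimal sketch under stated assumptions, not the computation in [257]: the short rate is assumed to satisfy $dr = (\alpha - \beta r)\,dt + \gamma\,dW$, the closed-form moments in `moments` are specializations worked out for this example, and `call_lognormal_form` is the standard Black-Scholes-like bond option expression, which is assumed here to correspond to formula (1) (not reproduced in this part of the text). All function names and parameter values are hypothetical.

```python
import math

def norm_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def b_fun(tau, beta):
    """(1 - exp(-beta * tau)) / beta."""
    return (1.0 - math.exp(-beta * tau)) / beta

def moments(T0, T, r0, alpha, beta, gamma):
    """Moments of xi = r(T0), eta = int_0^T r, zeta = int_0^T0 r (constant coefficients)."""
    level = alpha / beta  # long-run mean of the short rate

    def integral_mean(t):
        return level * t + (r0 - level) * b_fun(t, beta)

    def integral_var(t):
        return gamma ** 2 / beta ** 2 * (t - 2.0 * b_fun(t, beta) + b_fun(t, 2.0 * beta))

    var_xi = gamma ** 2 * b_fun(T0, 2.0 * beta)
    rho_xi_zeta = 0.5 * gamma ** 2 * b_fun(T0, beta) ** 2
    return {
        "mu_xi": level + (r0 - level) * math.exp(-beta * T0),
        "mu_eta": integral_mean(T), "mu_zeta": integral_mean(T0),
        "var_xi": var_xi,
        "var_eta": integral_var(T), "var_zeta": integral_var(T0),
        "rho_xi_zeta": rho_xi_zeta,
        "rho_xi_eta": rho_xi_zeta + var_xi * b_fun(T - T0, beta),
    }

def call_by_formula_14(K, T0, T, r0, alpha, beta, gamma):
    """European call on the T-bond with maturity T0, evaluated term by term as in (14)."""
    m = moments(T0, T, r0, alpha, beta, gamma)
    tau = T - T0
    b = b_fun(tau, beta)
    # ln P(T0, T) = a - r(T0) * b in the constant-coefficient case
    a = (alpha / beta - gamma ** 2 / (2.0 * beta ** 2)) * (b - tau) - gamma ** 2 * b ** 2 / (4.0 * beta)
    r_star = (math.log(K) - a) / (-b)  # formula (10)
    sd_xi = math.sqrt(m["var_xi"])
    first = math.exp(0.5 * m["var_eta"] - m["mu_eta"]) * norm_cdf(
        (r_star - (m["mu_xi"] - m["rho_xi_eta"])) / sd_xi)
    second = K * math.exp(0.5 * m["var_zeta"] - m["mu_zeta"]) * norm_cdf(
        (r_star - (m["mu_xi"] - m["rho_xi_zeta"])) / sd_xi)
    return first - second

def call_lognormal_form(K, T0, T, r0, alpha, beta, gamma):
    """The same price in Black-Scholes-like form (assumed to match formula (1))."""
    m = moments(T0, T, r0, alpha, beta, gamma)
    p_T = math.exp(0.5 * m["var_eta"] - m["mu_eta"])     # P(0, T)
    p_T0 = math.exp(0.5 * m["var_zeta"] - m["mu_zeta"])  # P(0, T0)
    s = math.sqrt(m["var_xi"]) * b_fun(T - T0, beta)
    d1 = math.log(p_T / (K * p_T0)) / s + 0.5 * s
    return p_T * norm_cdf(d1) - K * p_T0 * norm_cdf(d1 - s)

# Illustrative (hypothetical) inputs: K, T0, T, r0, alpha, beta, gamma.
args = (0.8, 1.0, 5.0, 0.05, 0.018, 0.3, 0.02)
print(call_by_formula_14(*args), call_lognormal_form(*args))
```

The two numbers should coincide up to rounding, which is one way to see what the "algebraic transformations" mentioned above accomplish in this special case.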


Relation (6) is seen to be an immediate consequence of (1) once one takes into
account the equality

$$(K - P(T^0,T))^+ = (P(T^0,T) - K)^+ - P(T^0,T) + K.$$


(Cf., e.g., the derivation of (9) in Chapter VI, § 4d.) This completes the proof. 
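
For orientation, the step from this identity to a relation between the put and call prices can be written out as follows. This is only a sketch: it uses the martingale-measure identities $E\,R(T^0) = P(0,T^0)$ and $E\,R(T^0)P(T^0,T) = P(0,T)$, and it states the resulting parity in generic form rather than in the exact notation of (6), which is not reproduced in this part of the text.

```latex
\begin{aligned}
P^0(T^0,T) &= E\,R(T^0)\,(K - P(T^0,T))^+ \\
           &= E\,R(T^0)\,(P(T^0,T) - K)^+ - E\,R(T^0)\,P(T^0,T) + K\,E\,R(T^0) \\
           &= C^0(T^0,T) - P(0,T) + K\,P(0,T^0).
\end{aligned}
```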




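3. It follows from (6) that the rational price $P^0(T^0,T)$ can be determined from the
'initial' prices $P(0,T^0)$, $P(0,T)$, the constant $K$, and the quantity
$\sigma(T^0,T)B(T^0,T)$, which, in its turn, is defined by the coefficients $\beta(s)$
and $\gamma(s)$ for $T^0 \le s \le T$.

In the case of the Vasiček model we have $\beta(s) \equiv \beta$, $\gamma(s) \equiv \gamma$,
and it is easy to see that

$$\sigma(T^0,T)B(T^0,T) = \frac{\gamma}{\beta}\big(1 - e^{-\beta(T - T^0)}\big)\Big(\frac{1}{2\beta}\big(1 - e^{-2\beta T^0}\big)\Big)^{1/2}.$$

The initial prices $P(0,T^0)$ and $P(0,T)$ can be found from the following formula
(see (12) in § 4a):

$$P(0,t) = \exp\{A(0,t) - r_0 B(0,t)\},$$

where

$$A(0,t) = \frac{1}{\beta}\big[1 - e^{-\beta t} - \beta t\big]\Big[\frac{\alpha}{\beta} - \frac{\gamma^2}{2\beta^2}\Big] - \frac{\gamma^2}{4\beta^3}\big(1 - e^{-\beta t}\big)^2,$$

$$B(0,t) = \frac{1}{\beta}\big(1 - e^{-\beta t}\big).$$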
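
The ingredients listed in this subsection are enough to compute European bond option prices in the Vasiček case. The sketch below is an illustration under assumptions: it takes the constant-coefficient expressions for $P(0,t)$ and $\sigma(T^0,T)B(T^0,T)$ from this subsection and combines them in the standard lognormal (Black-Scholes-like) form, which is assumed to agree with (1) and (6); those formulas are not reproduced here. Function names and parameter values are hypothetical.

```python
import math

def norm_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def initial_bond_price(t, r0, alpha, beta, gamma):
    """P(0, t) = exp{A(0, t) - r0 * B(0, t)} for constant alpha, beta, gamma."""
    e = 1.0 - math.exp(-beta * t)
    b0 = e / beta
    a0 = ((e - beta * t) / beta * (alpha / beta - gamma ** 2 / (2.0 * beta ** 2))
          - gamma ** 2 / (4.0 * beta ** 3) * e ** 2)
    return math.exp(a0 - r0 * b0)

def volatility_term(T0, T, beta, gamma):
    """sigma(T0, T) * B(T0, T) for constant beta and gamma."""
    return (gamma / beta * (1.0 - math.exp(-beta * (T - T0)))
            * math.sqrt((1.0 - math.exp(-2.0 * beta * T0)) / (2.0 * beta)))

def european_bond_options(K, T0, T, r0, alpha, beta, gamma):
    """Call and put prices in lognormal form (assumed to match (1) and (6))."""
    p_T0 = initial_bond_price(T0, r0, alpha, beta, gamma)
    p_T = initial_bond_price(T, r0, alpha, beta, gamma)
    s = volatility_term(T0, T, beta, gamma)
    d1 = math.log(p_T / (K * p_T0)) / s + 0.5 * s
    d2 = d1 - s
    call = p_T * norm_cdf(d1) - K * p_T0 * norm_cdf(d2)
    put = K * p_T0 * norm_cdf(-d2) - p_T * norm_cdf(-d1)
    return call, put

# Illustrative (hypothetical) inputs: K, T0, T, r0, alpha, beta, gamma.
call, put = european_bond_options(0.8, 1.0, 5.0, 0.05, 0.018, 0.3, 0.02)
print(call, put)
```

Only four numbers enter the final expressions: the two initial bond prices, the strike, and the volatility term, which mirrors the remark at the start of this subsection.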


§ 4c. American Option Pricing in Single-Factor Gaussian Models 


1. We continue our discussion of single-factor Gaussian (B,P)-models. We found
formulas for $C^0(T^0,T)$ and $P^0(T^0,T)$ in § 4b; now let $C^*(T^0,T)$ and $P^*(T^0,T)$
be the corresponding rational prices of American (call and put) options. Here we
assume that the exercise times belong to the class

$$\mathfrak{M}_0^{T^0} = \{\tau = \tau(\omega)\colon 0 \le \tau(\omega) \le T^0,\ \omega \in \Omega\}.$$

The (B,P)-market in question is both arbitrage-free and complete, and in
accordance with the general theory of pricing in such models (see Chapter VI, § 2 and
Chapter VII, § 5),

$$C^*(T^0,T) = \sup_{\tau \in \mathfrak{M}_0^{T^0}} E\exp\Big(-\int_0^{\tau} r(u)\,du\Big)(P(\tau,T) - K)^+ \tag{1}$$

and

$$P^*(T^0,T) = \sup_{\tau \in \mathfrak{M}_0^{T^0}} E\exp\Big(-\int_0^{\tau} r(u)\,du\Big)(K - P(\tau,T))^+. \tag{2}$$

Since

$$P(\tau,T) = \exp\big(A(\tau,T) - r(\tau)B(\tau,T)\big), \tag{3}$$

equalities (1) and (2) are related to standard optimal stopping problems of finding
the suprema

$$\sup_{\tau \in \mathfrak{M}_0^{T^0}} E\exp\Big(-\int_0^{\tau} r(u)\,du\Big)\,G(\tau,T;r(\tau)) \tag{4}$$

for Markov processes $r = (r(t))_{t \le T}$ and nonnegative functions $G(\tau,T;r(\tau))$. There
exists a well-developed theory for such problems (see, for instance, [441]).

2. The question of the value of $C^*(T^0,T)$ is very simple because, as is also the case for
standard call options in (B,S)-markets, the process

$$\Big(\exp\Big(-\int_0^{t} r(u)\,du\Big)(P(t,T) - K)^+\Big)_{t \le T^0}$$

is a submartingale and $C^*(T^0,T) = C^0(T^0,T)$ by Doob's stopping theorem (we
interpreted this equality in the following way in Chapter VI, § 5b and in § 3b of
the present chapter: an American call option coincides in effect with a European
option).

Proceeding to the prices $P^*(T^0,T)$, we consider the variables

$$Y^*(t,r) = \sup_{\tau \in \mathfrak{M}_t^{T^0}} E_{t,r}\exp\Big(-\int_t^{\tau} r(u)\,du\Big)\,G(\tau,T;r(\tau)), \tag{5}$$

where $E_{t,r}$ is averaging under the assumption $r(t) = r$, $\mathfrak{M}_t^{T^0}$ is the class
of stopping times $\tau = \tau(\omega)$ such that $t \le \tau(\omega) \le T^0$, and

$$G(t,T;r(t)) = (K - P(t,T))^+ = \big(K - \exp(A(t,T) - r(t)B(t,T))\big)^+$$

with $A(t,T)$ and $B(t,T)$ as in formulas (13) and (14) in § 4a.
We set

$$C^{T^0} = \{(t,r)\colon Y^*(t,r) > G(t,T;r),\ 0 \le t < T^0,\ r > 0\}$$

and

$$D^{T^0} = \{(t,r)\colon Y^*(t,r) = G(t,T;r),\ 0 \le t < T^0,\ r > 0\}.$$

On the basis of the characterization of the prices $Y^*(t,r)$ as the smallest
excessive majorants of the functions $G(t,T;r)$ (see [340], [363], [441; Chapter III],
[467], [478]), we can show that there exists a continuous interface function $r^* = r^*(t)$,
$t < T^0$, such that the domains $C^{T^0}$ and $D^{T^0}$ (of the continuation of observations and
their stopping, respectively) have the following form:

$$C^{T^0} = \{(t,r)\colon r(t) < r^*(t),\ 0 \le t < T^0,\ r > 0\}$$

and

$$D^{T^0} = \{(t,r)\colon r(t) \ge r^*(t),\ 0 \le t < T^0,\ r > 0\}.$$


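Here the function $Y^* = Y^*(t,r)$ and the interface function $r^* = r^*(t)$ are solutions
of the following Stefan problem:

$$\frac{\partial Y^*(t,r)}{\partial t} + LY^*(t,r) = r\,Y^*(t,r), \qquad (t,r) \in C^{T^0},$$

where

$$LY^*(t,r) = (\alpha(t) - \beta(t)r)\,\frac{\partial Y^*(t,r)}{\partial r} + \frac12\,\gamma^2(t)\,\frac{\partial^2 Y^*(t,r)}{\partial r^2};$$

$$Y^*(t,r) = G(t,T;r) \tag{6}$$

in the domain $D^{T^0}$, and we have the condition of smooth pasting on $\partial D^{T^0}$:

$$\frac{\partial Y^*(t,r)}{\partial r}\bigg|_{r \uparrow r^*(t)} = \frac{\partial G(t,T;r)}{\partial r}\bigg|_{r \downarrow r^*(t)}. \tag{7}$$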


We know of no explicit analytic solution to this problem (as, incidentally, also
in the case of (B,S)-models; see § 3c). At the same time, in view of the abundance
of American options in real life one would like to have an idea of the difference
between the price $P^*(T^0,T)$ of an American option and the price $P^0(T^0,T)$ of a
European option, and of the behavior of the interface $r^* = r^*(t)$, $t < T^0$.

Arguments similar to the ones we used in § 3d, while looking for relations between
the prices of American and European options in a (B,S)-market, can be used for
(B,P)-markets in our case and bring us to the following result (cf. (19) in § 3d):
for $0 \le t < T^0$ we have

$$Y^*(t,r) = Y^0(t,r) + K\int_t^{T^0} E_{t,r}\Big\{\exp\Big(-\int_t^{s} r(u)\,du\Big)\,r(s)\,I\big(r(s) > r^*(s)\big)\Big\}\,ds, \tag{8}$$

where

$$\begin{aligned}
Y^0(t,r) &= E_{t,r}\exp\Big(-\int_t^{T^0} r(u)\,du\Big)(K - P(T^0,T))^+ \\
&= E_{t,r}\exp\Big(-\int_t^{T^0} r(u)\,du\Big)\big(K - \exp(A(T^0,T) - r(T^0)B(T^0,T))\big)^+.
\end{aligned} \tag{9}$$


Using formulas (12) and (13) in § 4b and the expressions there for $\mu_\xi$, $\mu_\eta$, ...,
after simple transformations we obtain

$$Y^*(t,r) = Y^0(t,r) + K\int_t^{T^0} P(t,s)\big\{\Phi(-v^*(t,s))\,f(t,s) + \sigma(t,s)\,\varphi(v^*(t,s))\big\}\,ds \tag{10}$$

for $0 \le t < T^0$, where

$$\sigma^2(t,s) = D\big(r(s)\,|\,r(t) = r\big), \qquad
f(t,s) = -\frac{\partial}{\partial s}\ln P(t,s), \qquad
v^*(t,s) = \frac{r^*(s) - f(t,s)}{\sigma(t,s)}.$$

(See a detailed derivation of (8) and (10) in [257].)

In particular, since

$$P^*(T^0,T) = Y^*(0,r_0) \quad\text{and}\quad P^0(T^0,T) = Y^0(0,r_0),$$

it follows that

$$P^*(T^0,T) = P^0(T^0,T) + K\int_0^{T^0} P(0,s)\big\{\Phi(-v^*(0,s))\,f(0,s) + \sigma(0,s)\,\varphi(v^*(0,s))\big\}\,ds.$$


In conclusion we point out that (10) also enables one to find (by means of backward
induction), at least approximately, the values of the price $P^*(T^0,T)$ and the
interface function $r^* = r^*(t)$, $t < T^0$.
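
As a rough numerical illustration of the gap between the American and European put prices, one can run a backward induction directly on the stopping problem (5). The sketch below does not implement the integral relation (10). It substitutes a plain dynamic-programming recursion on a grid of short-rate values, in the constant-coefficient (Vasiček) case $dr = (\alpha - \beta r)\,dt + \gamma\,dW$, with exercise allowed only at the grid times. The grid sizes, function names and parameter values are assumptions made for this example, and the closed-form European put used for comparison is the standard lognormal expression, assumed to agree with (6) of § 4b.

```python
import numpy as np
from math import erf, exp, log, sqrt

def norm_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def bond_price(r, tau, alpha, beta, gamma):
    """P(t, t + tau) = exp(A - r * B) for constant coefficients; r may be an array."""
    b = (1.0 - np.exp(-beta * tau)) / beta
    a = (alpha / beta - gamma ** 2 / (2.0 * beta ** 2)) * (b - tau) - gamma ** 2 * b ** 2 / (4.0 * beta)
    return np.exp(a - r * b)

def european_put_closed_form(K, T0, T, r0, alpha, beta, gamma):
    """European put in lognormal form (assumed to match (6) in Section 4b)."""
    p_T0 = bond_price(r0, T0, alpha, beta, gamma)
    p_T = bond_price(r0, T, alpha, beta, gamma)
    s = gamma / beta * (1.0 - exp(-beta * (T - T0))) * sqrt((1.0 - exp(-2.0 * beta * T0)) / (2.0 * beta))
    d1 = log(p_T / (K * p_T0)) / s + 0.5 * s
    return K * p_T0 * norm_cdf(-(d1 - s)) - p_T * norm_cdf(-d1)

def put_prices_on_grid(K, T0, T, r0, alpha, beta, gamma, n_steps=50, n_r=401, width=6.0):
    """European and American put values at (0, r0) by backward induction on a grid."""
    level = alpha / beta
    dt = T0 / n_steps
    stat_sd = gamma / np.sqrt(2.0 * beta)
    r = np.linspace(level - width * stat_sd, level + width * stat_sd, n_r)
    # Exact one-step Gaussian transition of the short rate, discretized on the grid.
    mean = level + (r - level) * np.exp(-beta * dt)
    sd = gamma * np.sqrt((1.0 - np.exp(-2.0 * beta * dt)) / (2.0 * beta))
    weights = np.exp(-0.5 * ((r[None, :] - mean[:, None]) / sd) ** 2)
    weights /= weights.sum(axis=1, keepdims=True)
    discount = np.exp(-r * dt)  # left-point approximation of exp(-int r du) over one step
    european = np.maximum(K - bond_price(r, T - T0, alpha, beta, gamma), 0.0)
    american = european.copy()
    for k in range(n_steps - 1, -1, -1):
        t = k * dt
        european = discount * (weights @ european)
        continuation = discount * (weights @ american)
        exercise = np.maximum(K - bond_price(r, T - t, alpha, beta, gamma), 0.0)
        american = np.maximum(exercise, continuation)
    return float(np.interp(r0, r, european)), float(np.interp(r0, r, american))

# Illustrative (hypothetical) inputs: K, T0, T, r0, alpha, beta, gamma.
args = (0.8, 1.0, 5.0, 0.05, 0.018, 0.3, 0.02)
eur, amer = put_prices_on_grid(*args)
print(eur, amer, european_put_closed_form(*args))
```

The grid value of the European put can be compared with the closed form as a sanity check on the discretization, and the difference between the two grid values gives a crude estimate of the early exercise premium. An approximate interface $r^*(t)$ could be read off as the smallest grid rate at which `exercise` attains the maximum, although the sketch does not store it.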

As regards various methods of American option pricing (including the ‘case with
dividends’), see, for instance, [28], [29], [56], [57], [179], [257], [376], [478], or [479].